# Information structure in spoken Japanese

Particles, word order, and intonation

Natsuko Nakagawa

Topics at the Grammar-Discourse Interface 8

### Topics at the Grammar-Discourse Interface

Editors: Philippa Cook (University of Potsdam), Anke Holler (University of Göttingen), Cathrine Fabricius-Hansen (University of Oslo)




Nakagawa, Natsuko. 2020. *Information structure in spoken Japanese*: *Particles, word order, and intonation* (Topics at the Grammar-Discourse Interface 8). Berlin: Language Science Press.

This title can be downloaded at: http://langsci-press.org/catalog/book/178

© 2020, Natsuko Nakagawa

Published under the Creative Commons Attribution 4.0 Licence (CC BY 4.0): http://creativecommons.org/licenses/by/4.0/

ISBN: 978-3-96110-138-2 (Digital), 978-3-96110-139-9 (Hardcover)

ISSN: 2567-3335

DOI: 10.5281/zenodo.4291753

Source code available from www.github.com/langsci/178

Collaborative reading: paperhive.org/documents/remote?type=langsci&id=178

Cover and concept of design: Ulrike Harbort

Typesetting: Natsuko Nakagawa, Sebastian Nordhoff

Proofreading: Brett Reynolds, Carla Bombi, Claudia Marzi, Eran Asoulin, Katja Politt, Marina Frank, Mykel Brinkerhoff, Jean Nitzke, Matthew Weber, Piklu Gupta

Fonts: Libertinus, Arimo, DejaVu Sans Mono

Typesetting software: XƎLATEX

Language Science Press, xHain, Grünberger Str. 16, 10243 Berlin, Germany, langsci-press.org

Storage and cataloguing done by FU Berlin





# **Acknowledgments**

This work is based on my PhD thesis, *Information Structure in Spoken Japanese: Particles, Word Order, and Intonation*, submitted in 2016 to the Graduate School of Human and Environmental Studies, Kyoto University. In 2005, I started as a graduate student of linguistics in Kyoto University, and more than 10 years have passed since then. I cannot believe that I concentrated on a single topic for this long time, and, on top of that, I cannot believe that so many people patiently and kindly encouraged and helped me to finish the thesis.

First, I am most grateful to my advisor, Yuji Togo. Since I started my graduate education, he taught me a lot of important things, suggested interesting issues related to my interests, gave me precise advice, and showed me how a scientist should be through his research and his lectures. Without him, I could not have even imagined finishing the thesis. He encouraged me to do things that seemed impossible for me to achieve.

Also, I thank Yukinori Takubo and Koji Fujita, who refereed my dissertation. Their detailed questions and comments shed light on new issues of my thesis from different perspectives.

In my early career in the graduate school in Kyoto University, I met many professors and colleagues who shared interesting topics with me, advised my research, and showed their own exciting work: Masa-Aki Yamanashi, Daisuke Yokomori, Yoshihiko Asao, Masanobu Masuda, Akihiro Yamazaki, Yukinori Kimoto, Chris Davis, Tomoko Endo, and many others who studied with me. I was very lucky to see them. It is unfortunate that I can only name a few people here, but I thank all of them.

I learned a lot through investigating the same topic and writing a paper with Yoshihiko Asao, Naonori Nagaya, and Daisuke Yokomori. In particular, the works I did with Yoshihiko Asao and Naonori Nagaya, triggered by lectures at the LSA Summer Institute 2007 at Stanford University, influenced my research methods and research topics. Also, I thank Yasuharu Den and his colleagues, Katsuya Takanashi, Hanae Koiso, Mika Enomoto, Kikuo Maekawa, and many others, who inspired me and expanded my view on linguistic research. I also owe a lot to Yasuharu Den, who helped me to conduct statistical analysis throughout the thesis. Potential problems, mistakes, and misunderstandings are due to my limited knowledge of statistics.

Chapter 4, in particular §4.3, is based on my master's thesis submitted to State University of New York, Buffalo. I also thank the faculty members and colleagues there. In particular, I am grateful to Matthew Dryer, who gave me insightful comments on my master's thesis.

Last but not least, I am indebted to reviewers of Language Science Press, who later revealed their identities: Yukiko Morimoto and Satoshi Imamura. They devoted their time to giving detailed comments and questions and helped this work to be further refined. Furthermore, I am grateful for those who edited and corrected this book.

While I was writing my dissertation, I was funded by the Long-Term Study Abroad Program of the Japan Student Services Organization and by the Japan Society for the Promotion of Science (15J03835).

# **Abstract**

This book investigates the associations between information structure and linguistic forms in spoken Japanese mainly by analyzing spoken corpora. It proposes multi-dimensional annotation and analysis procedures for spoken corpora and explores the relationships between information structure on the one hand and particles, word order, and intonation on the other.

Particles, word order, and intonation in spoken Japanese have been investigated separately in different frameworks and in different subfields of the literature; there was no unified theory accounting for all the phenomena. This book provides a unified investigation of all the phenomena in question, by annotating all target expressions according to the same criteria and by investigating them all from the same analytical framework. Chapter 1 outlines the questions to be investigated in the study and introduces the methodology of the book. Chapter 2 reviews the literature on Japanese linguistics as well as the literature on information structure in different languages. Chapter 3 proposes the analytical framework of the book. Major findings are discussed in Chapters 4, 5, and 6.

Chapter 4 analyzes the distributions of topic and case particles. It is made clear that so-called topic particles (*wa*, zero particles, *toiuno-wa*, and *kedo/ga* preceded by copula) are mainly sensitive to the given-new taxonomy, whereas case particles (*ga*, *o*, and the zero particles) are sensitive to both focushood and grammatical function. While the distinction between *wa* and *ga* has attracted much attention in traditional Japanese linguistics, this book analyzes the distribution of different kinds of topic and case particles, including zero particles.

Chapter 5 studies word order; more specifically, clause-initial, pre-predicate, and post-predicate noun phrases. Topical NPs appear either clause-initially or post-predicatively, while focal NPs appear pre-predicatively. Clause-initial and post-predicate NPs differ from each other mainly in their status in the given-new taxonomy. The previous literature investigated clause-initial, pre-predicate, and post-predicate constructions from different frameworks; however, there was no unified account of word order in Japanese. The book outlines an account of word order in spoken Japanese within a unified framework.


Chapter 6 investigates intonation. While the previous literature mainly concentrated on contrastive focus, this book discusses intonation from the perspective of both topic and focus. It is argued that intonation corresponds to a unit of processing and that information structure influences the form of the intonation units.

Chapter 7 discusses the theoretical implications of these findings. Finally, Chapter 8 summarizes the book and points out some remaining issues and possible future studies.

# **1 Introduction**

### **1.1 Aims of the study**

The goal of this study is twofold. First, I will investigate the relationships between information structure and linguistic forms in spoken Japanese. Second, I will propose a method to investigate such relationships in any language by using corpora.

Speakers of Japanese, like speakers of many other languages, infer other people's knowledge and express their assumptions about it using various linguistic and non-linguistic tools. Consider a conversation between three people, A, B, and C, from *the Chiba three-party conversation corpus* (Den & Enomoto 2007). In (1–A1), one of the participants, A, starts talking about *ano koohii-meekaa* 'that coffee machine'. In B2–B4, B explains why A started to talk about it; it is related to the previous topic (too many people gathered in a small room). C just adds a weak backchannel response in C5. In A6–A7, A asks C whether she knows about the new coffee machine that arrived in building E. In C8–C11, C answers A that she knows about it but has never tried it.<sup>1</sup>

(1) A1: ano **koohii-meekaa** sugoi-yo-ne
that coffee-maker great-fp-fp
'That coffee machine is excellent, isn't it?'

B2: **koohii-meekaa**-o mi-tai
coffee-maker-acc see-want
'(I) wanna see the coffee machine.'

B3: tukat-teru-no-o mi-tai-tte iu-no-to
use-pfv-nmlz-acc see-want-quot say-nmlz-and
'(They) want to see (us) use (the coffee machine), and'

B4: koohii nom-e-nai san-nin-gumi-mo ita-kara otya non-de-ta
coffee drink-cap-neg three-cl.person-group-also exist-because tea drink-prog-past
'since there were also three people who cannot drink coffee, they drank tea.'

<sup>1</sup> Some of the utterances were omitted for the sake of simplicity.


From this short conversation, observers (namely, we) can infer that A in A1 assumed that the other participants already knew about the great coffee machine that was introduced in their lab. One can also infer that B in B2–B4 already knew about the coffee machine. In A6–A7, A appears to think that C might not know about the coffee machine. However, C in C8 explicitly denies A's concern.

Why is it possible for us to infer the speakers' assumptions about the knowledge of other participants? In this case, linguistic expressions such as *ano (koohii meekaa)* 'that (coffee machine)' in A1 and *sit-teru:* '(do you) know...?' in A6 indicate A's assumptions about the other participants' knowledge.

This study investigates more subtle linguistic expressions than these determiners in spoken Japanese, namely particles, word order, and intonation. As an example, let us discuss the distinction between the particles *ga* and *wa*, which has been discussed for a long time in the literature on Japanese linguistics. Examples (2-a), containing the particle *ga*, and (2-b), containing the particle *wa*, express the same proposition ('A/the dog is running'), where definiteness is not explicit in the original Japanese sentences. The expression *inu* 'dog' followed by *ga* in (2-a) can be interpreted as either definite or indefinite, while the same expression followed by *wa* in (2-b) can only be interpreted as definite: from (2-b) we can infer that the speaker assumes that the hearer already knows about the dog.

	- dog-top run-prog 'The dog is running.' (Constructed)

As will be discussed in Chapter 4, however, it is not the case that the NP coded by *wa* is always definite, nor is it the case that the NP coded by *ga* is always indefinite. What determines the usage of the particles? Moreover, particle choice interacts with other factors such as word order and intonation. This study investigates how information structure affects particle choice, word order, and intonation employing a corpus of spoken Japanese.

### **1.2 Background**

Information structure in this study comprises "the utterance-internal structural and semantic properties reflecting the relation of an utterance to the discourse context, in terms of the discourse status of its content, the actual and attributed attentional status of the discourse participants, and the participants' prior and changing attitudes (knowledge, beliefs, intentions, expectations, etc.)" (Kruijff-Korbayová & Steedman 2003: 250). I assume that information structure is a subordinate part of discourse structure and that it is a clause-level unit that does not allow recursion. Also, I assume that information structure should be analyzed at the surface level rather than at the level of underlying semantics (or logical form).

Studies on information structure can be traced back to two sources (see Kruijff-Korbayová & Steedman (2003) for a useful survey). One originates in the studies on definite and indefinite descriptions by Russell (1905) and Strawson (1950; 1964). These studies triggered the discussion on presupposition and assertion, which is still a matter of debate. In particular, this line of research has influenced contemporary scholars of logic, formal semantics, and generative grammar (Chomsky 1965; Jackendoff 1972; Selkirk 1984; Rooth 1985; Rizzi 1997; Erteschik-Shir 1997; 2007; Büring 2007; Ishihara 2011; Krifka & Musan 2012; Endo 2014). The other source originates from the Prague School (Mathesius 1928; 1929; Sgall 1967; Firbas 1975), whose studies have particularly inspired functional linguistics (Bolinger 1965; Halliday 1967; Kuno 1973b; Gundel 1974; Chafe 1976; 1994; Prince 1981; Givón 1983; Tomlin 1986; Lambrecht 1994; Birner & Ward 1998; 2009). Some scholars were influenced by both of these traditions (Vallduví 1990; Steedman 1991; Vallduví & Vilkuna 1998).

Almost independently from this European and American tradition of linguistics, Japanese linguistics focused its attention on the so-called topic particle *wa* in Japanese, often as opposed to the case particle *ga* (Matsushita 1928; Yamada 1936; Tokieda 1950/2005; Mikami 1953/1972; 1960; Onoe 1981; Kinsui 1995; Kikuchi 1995; Noda 1996; Masuoka 2000; 2012). Beyond the use of *wa* itself, this discussion also raised the question of the nature of the subject because, on the surface, *wa* frequently alternates with *ga*, the so-called subject particle. See Chapter 2 for details.

Recently, more studies have investigated the actual production and understanding of language rather than just the acceptability judgements of constructed examples. Corpus-oriented studies (e.g., Calhoun et al. 2005; Götze et al. 2007; Chiarcos et al. 2011) inherit from the two information structure traditions: the logical tradition and the functional one. Other corpus-oriented studies such as Hajičová et al. (2000), annotating Czech, are based on the work of the Prague School. There are also questionnaires for eliciting expressions related to information structure cross-linguistically (Skopeteas et al. 2006). Further, Cowles (2003) and Cowles & Ferreira (2012) investigate information structure mainly by employing psycholinguistic experiments.

I am mostly influenced by the traditions of functional linguistics and corpus linguistics. Although I tried to include the work of other traditions as much as possible, sometimes readers from other schools might have difficulties understanding my assumptions. I assume that usage shapes a language (Givón 1976; Comrie 1983; 1989; Bybee & Hopper 2001) and am interested in how linguistic usage affects its shape. In this study, I focus on the question of how language usage related to information structure affects linguistic form in Japanese.

### **1.3 Methodology**

I investigate linguistic forms associated with information structure in spoken Japanese mainly by examining spoken corpora. It is well known that information structure phenomena are so subtle that slight changes in the context can affect the judgement of the sentence in question, meaning that acceptability judgements from a single person (i.e., the author) are not reliable. This is why I employ spoken corpora, in which speakers produce utterances naturally, without attending to information structure as closely as linguists do. Moreover, contexts are available in spoken corpora, which are crucial for observers to determine the information structure of a sentence.

It is also well known, however, that information structure annotation is very hard. There are studies on the annotation of information structure for various types of corpus and for different languages (Hajičová et al. 2000; Calhoun et al. 2005; Götze et al. 2007; Ritz et al. 2008; Chiarcos et al. 2011). Some use syntactic information to decide the information structure of a sentence (Hajičová et al. 2000); some use intonation (Calhoun et al. 2005); others use linguistic tests (Götze et al. 2007; Chiarcos et al. 2011); but many studies decide on the basis of several features. For example, in annotating "aboutness topic", Götze et al. (2007) employ not only tests such as whether the NP in question can be the answer to the question "let me tell you something about X", but also morphological information of the NP such as referentiality, definiteness, genericity, etc.

In the present work, I annotate multiple features of topichood and focushood, rather than annotating homogeneous "topic" and "focus" categories. I consider a topic to be a cluster of features, comprising "presupposed", "evoked", "definite", "specific", "animate", etc. I also see focus as a cluster of features, comprising "asserted", "brand-new", "indefinite", "nonspecific", "inanimate", etc. I assume that topic and focus typically (frequently) have these features, but that these are not always all necessarily present. There could be infrequent (i.e., atypical) topics that are indefinite or inanimate, or there could be foci that are definite or animate. See the discussion in Chapter 3 for details.
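The idea of topic and focus as feature clusters can be pictured with a small data structure. The following is only a minimal sketch, not the annotation scheme of this study: the two feature inventories are the partial lists given above (the text ends each with "etc."), while the class name `AnnotatedNP`, the function `feature_profile`, and both example NPs are hypothetical and carry invented feature sets.

```python
from dataclasses import dataclass

# Partial feature inventories quoted from the text; treating them as
# flat, unweighted sets is an assumption made for this illustration.
TOPIC_FEATURES = frozenset({"presupposed", "evoked", "definite", "specific", "animate"})
FOCUS_FEATURES = frozenset({"asserted", "brand-new", "indefinite", "nonspecific", "inanimate"})


@dataclass(frozen=True)
class AnnotatedNP:
    """A hypothetical annotated noun phrase: a label plus its feature set."""
    label: str
    features: frozenset


def feature_profile(np: AnnotatedNP) -> dict:
    """Count how many topic-like and focus-like features an NP carries."""
    return {
        "topic_like": len(np.features & TOPIC_FEATURES),
        "focus_like": len(np.features & FOCUS_FEATURES),
    }


# Invented examples: a typical topic and an atypical (indefinite, inanimate) one.
typical_topic = AnnotatedNP(
    "hypothetical NP 1",
    frozenset({"presupposed", "evoked", "definite", "specific", "animate"}),
)
atypical_topic = AnnotatedNP(
    "hypothetical NP 2",
    frozenset({"presupposed", "evoked", "indefinite", "inanimate"}),
)

print(feature_profile(typical_topic))
print(feature_profile(atypical_topic))
```

Counting features rather than assigning a single "topic" or "focus" label keeps atypical cases visible: the second NP still carries topic-like features even though it is indefinite and inanimate.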

I sometimes employ acceptability judgements and production experiments to support my argument. I believe that, in the future, it will be necessary to test all the hypotheses using multiple methods for a scientific investigation of language.

### **1.4 Overview**

I will now outline the chapters of this book. In Chapter 2, I provide an overview of the previous studies on information structure across languages. I also describe the basic features of Japanese and review studies on Japanese related to this study. In Chapter 3, I outline the framework employed in the study: the notions of topic, focus, and features related to them. Moreover, I introduce the nature of the corpora, the annotation procedure, and the methods employed to analyze the results. The following three chapters analyze linguistic forms found in spoken Japanese. Chapter 4 investigates particles, Chapter 5 analyzes word order, and Chapter 6 inquires into intonation. In Chapter 7, I summarize the study and discuss its theoretical implications.

# **2 Background**

### **2.1 Introduction**

This chapter provides an overview of various definitions of (or notions frequently associated with) topics (§2.2) and foci (§2.3). In each section, I first introduce the definition of topic and focus used in this study. Then I review the literature. Topic is roughly equivalent to "psychological subject" (von der Gabelentz 1869), "theme" (e.g., Daneš 1970; Halliday 2004), "ground", "background", and "link" (Vallduví 1994), although there are many (sometimes crucial) differences among these notions. In the same manner, focus is roughly equivalent to "psychological predicate", "rheme", "foreground", and "comment". Gundel (1974) and Kruijff-Korbayová & Steedman (2003) provide a useful summary of the history of these notions.

In reviewing the literature, I emphasize two aspects: the importance of the definition of topic and focus proposed in the study and, at the same time, the heterogeneous characteristics of these notions. The present study argues that topics and foci in different languages form prototype categories composed of various features that are present to different degrees. This position is similar to that of Firbas (1975) and Givón (1976), who viewed topic as a gradient notion, although the features they propose are not exactly the same. Also, I assume a single flat layer of information structure with multiple features, rather than the multiple layers assumed by many researchers (such as the topic-comment vs. focus-background layers).

Finally, in §2.4 I review the literature on Japanese particles, word order, and intonation.

### **2.2 Topic**

In this section, I give a brief overview of the definitions of topic. The notion of topic is controversial and has a complicated history. I classify these complicated notions into several representative categories in the following subsections. Before the overview, I first introduce the definition of topic assumed in this study to make the discussion clearer.


### **2.2.1 The definition of topic in this study**

Since I assume that information structure is a cognitive notion, I define topic from a cognitive standpoint. The definition is stated in (1).

(1) The topic is a discourse element that the speaker assumes or presupposes to be shared (known or taken for granted) and uncontroversial in a given sentence both by the speaker and the hearer.

This definition follows and elaborates the idea of topics (*daimoku-tai* 'topic form') in Matsushita (1928), who states that "the theme of judgement [topic] should not be changed before the judgement" (p. 774, translated by NN). Also, he states that the topic is "determinate" (p. 775).

In terms of the given-new taxonomy proposed by Prince (1981), shown in (2), topics defined in (1) include unused, declining (to be discussed below), inferable, and evoked elements (Lambrecht 1994: §4.4.2).<sup>1</sup> By the statement that topics are "shared", I mean that topics are either unused, declining, inferable, or evoked.

(modified from Prince 1981: 237)

A new element refers to an entity first introduced by the speaker into the discourse; in other words, "[the speaker] tells the hearer to 'put it on the counter'" (Prince 1981: 235). A brand-new element refers to a new entity that "the hearer may have had to create" (ibid.). There are two types of brand-new elements: anchored and unanchored. "A discourse entity is Anchored if the NP representing it is linked, by means of another NP, or 'Anchor', properly contained in it, to some other discourse entity" (op.cit.: 236). According to Prince, "*a bus* [...] is Unanchored, or simply Brand-New, whereas *a guy I work with* [...], containing the NP *I*, is Brand-new Anchored, as the discourse entity the hearer creates for this particular guy will be immediately linked to his/her discourse entity for the speaker" (ibid.). An unused element refers to an entity for which "the hearer may be assumed to have a corresponding entity in his/her own model and simply has to place it in (or copy it into) the discourse-model" (ibid.), such as *Noam Chomsky*. An NP refers to an evoked entity "if [the] NP is uttered whose entity is already in the discourse-model, or 'on the counter'" (ibid.). "A discourse entity is Inferable if the speaker assumes the hearer can infer it, via logical – or, more commonly, plausible – reasoning, from discourse entities already Evoked or from other Inferables" (ibid.).

<sup>1</sup> Inferable elements are further divided into containing and non-containing, and evoked elements are divided into textually and situationally evoked. I omit these distinctions since they are irrelevant to the discussion.

In addition, I include what I call "declining elements" (Prince 1981) in the taxonomy. A declining element refers to an entity which was mentioned a while ago but is assumed to be declining in the hearer's mind because it has not been referred to for a while. Declining elements are assumed to be in a semi-active state in terms of Chafe (1987; 1994). The referents of declining elements are in a semi-active state especially through "deactivation from an earlier active state" (Chafe 1987: 29). Chafe's concept of "semi-active" also includes inferable entities. I introduce a new term in order to distinguish declining from inferable entities.
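The categories of the taxonomy, and the sense in which a topic must be "shared", can be summarized in a short sketch. This is an illustration under stated assumptions rather than part of the original analysis: the enumeration `Status`, the set `SHARED`, and the function names are hypothetical, and the three example NPs are the ones Prince (1981) is quoted as giving above.

```python
from enum import Enum


class Status(Enum):
    """Statuses named in the text; the flat layout is an assumption."""
    BRAND_NEW_UNANCHORED = "brand-new (unanchored)"
    BRAND_NEW_ANCHORED = "brand-new (anchored)"
    UNUSED = "unused"
    INFERABLE = "inferable"
    DECLINING = "declining"
    EVOKED = "evoked"


# "Shared" in the sense of definition (1): unused, declining, inferable, or evoked.
SHARED = frozenset({Status.UNUSED, Status.DECLINING, Status.INFERABLE, Status.EVOKED})


def is_shared(status: Status) -> bool:
    """True if an element with this status counts as shared."""
    return status in SHARED


def can_be_topic(status: Status, uncontroversial: bool) -> bool:
    """Sharedness alone is not enough: the element must also be assumed
    to be uncontroversial (see the discussion of definition (1))."""
    return is_shared(status) and uncontroversial


# Example NPs quoted from Prince (1981) in the text above.
EXAMPLES = {
    "a bus": Status.BRAND_NEW_UNANCHORED,
    "a guy I work with": Status.BRAND_NEW_ANCHORED,
    "Noam Chomsky": Status.UNUSED,
}

for np, status in EXAMPLES.items():
    print(f"{np}: {status.value}, shared = {is_shared(status)}")
```

Passing `uncontroversial` as a separate argument reflects the point that sharedness is a property of the referent's status, whereas uncontroversiality has to be judged for the element in a given sentence.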

Note that the condition that the speaker assumes the element to be shared is a necessary but not a sufficient condition of topic; topics are assumed by the speaker to be shared with the hearer, but it is not necessarily the case that all shared elements are topics. The topic element must also be assumed to be uncontroversial, and I argue that this is a necessary and sufficient condition for topic (see §3.3.1 for details).

Also note that the definition of topic in (1) includes the heterogeneous elements in (2). Therefore, definition (1) does not necessarily contradict the definitions proposed in the previous literature. Rather, it includes many of the previous definitions and restates them in terms of a cognitive viewpoint.

In the following sections, I provide a brief overview of different notions of topic proposed in the previous literature, and compare them with the notion I propose in the present study.

### **2.2.2 Aboutness**

One of the most representative definitions of topic is that a topic is what the sentence is about. This definition is employed by various linguists such as Matsushita (1928), Kuno (1972), Gundel (1974), Reinhart (1981), Dik (1978), Lambrecht (1994), and Erteschik-Shir (2007). Topics as things under discussion (e.g., Heycock 2008) are also classified here. Here I will discuss Reinhart (1981) because it is one of the most detailed and influential works.

Reinhart (1981), inspired by Strawson (1964), posits that topics should be characterized in terms of *aboutness*. More precisely, "an expression will be understood as representing the topic if the assertion is understood as intending to expand our knowledge of this topic" (Reinhart 1981: 59).<sup>2</sup> Moreover, the truth value of a sentence is assessed with respect to the topic (ibid.). She proposes some tests to identify a topic in a sentence. The first one is an *as for*/*regarding* test; an expression X is a topic if it is felicitously paraphrased as {*as for*/*regarding*} X (p. 63; see also Kuno (1972; 1976); Gundel (1974)). Therefore, *Matilda* in (3-a) and *your second proposal* in (3-b) are topics.

	- b. **Regarding your second proposal**, the board has found it unfeasible. (Reinhart 1981: 59)

As she cautions, however, not all topics can be identified in this way because *as for* and *regarding* are typically used to change the current topic (Keenan & Schieffelin 1976; Duranti & Ochs 1979). For example, *as for this book* in (4) is awkward even though it is clearly a topic. This is because the book has already been the topic of the previous sentence.

(4) Kracauer's book is probably the most famous ever written on the subject of the cinema. ??**As for this book**, many more people are familiar with its catchy title then[sic] are acquainted with its [turgid] text. (Reinhart 1981: 64)

Therefore, she proposes a "more reliable test" (ibid.), which embeds the sentence in question in *about* sentences. This is exemplified in (5), where the book is correctly identified as a topic.

(5) He said **{about/of} the book** that many more people are familiar with its catchy title than are acquainted with its turgid text. (op. cit., 65)

<sup>2</sup>Although Reinhart's definition of topic is basically from Strawson, the discussion in this work is based on Reinhart (1981). This is because she notes that her "presentation of [the criteria of topics] may not be fully loyal to [Strawson's] original intentions" since "[Strawson's] criteria are introduced in a rather parsimonious manner" (59).


To formalize this intuition, Reinhart introduces the notion of possible pragmatic assertions. It is assumed that "each declarative sentence is associated with a set of possible pragmatic assertions (PPA), which means that that sentence can be used to introduce the content of any of these assertions into the context set" (p. 80). The context set of a given discourse at a given point is a set of propositions that both the speaker and the hearer have accepted to be true at that point (Stalnaker 1978). The set of PPAs of a given sentence $S$ is defined in (6), where $\phi$ indicates the proposition expressed by $S$.

(6) $\mathrm{PPA}(S) = \phi$ together with $[\langle \alpha, \phi \rangle$: $\alpha$ is the interpretation of an NP expression in $S]$ (Reinhart 1981: 80-81)

Assuming (6), the topic expression of a sentence S in a context C is defined as in (7).

(7) Topic is "the expression corresponding to $\alpha_i$ in the pair $\langle \alpha_i, \phi \rangle$ of $\mathrm{PPA}(S)$ which is selected in C". (op. cit., 81)

This is achieved in the following steps: (i) "if possible, the proposition $\phi$ expressed in S will be assessed by the hearer in C with respect to the subset of propositions already listed in the context set under $\alpha_i$", and (ii) "if $\phi$ is not rejected it will be added to the context set under the entry $\alpha_i$" (ibid.).
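The definitions in (6) and (7) and the two steps above can be sketched procedurally. The following is a rough illustration only, not Reinhart's formalization: propositions and NP interpretations are simplified to plain strings, the context set is modelled as a mapping from entries to lists of propositions, the hearer's assessment in step (i) is reduced to a `rejected` flag supplied by the caller, and all function and variable names are hypothetical.

```python
from collections import defaultdict


def possible_pragmatic_assertions(proposition, np_interpretations):
    """Sketch of (6): the bare proposition together with one
    (entry, proposition) pair per NP interpretation in the sentence."""
    return [proposition] + [(entry, proposition) for entry in np_interpretations]


def update_context_set(context_set, entry, proposition, rejected=False):
    """Sketch of step (ii): unless the hearer rejects the proposition,
    it is filed in the context set under the selected entry."""
    if not rejected:
        context_set[entry].append(proposition)
    return context_set


# Illustrative sentence with two NP interpretations.
phi = "the president hates the Delft china set"
ppa = possible_pragmatic_assertions(phi, ["the president", "the Delft china set"])

# Suppose the context selects the pair whose entry is "the president";
# by (7), the corresponding expression is then the topic.
context_set = defaultdict(list)
selected_entry, _ = ppa[1]
update_context_set(context_set, selected_entry, phi)

print(ppa)
print(dict(context_set))
```

The sketch makes one property of the definition visible: which pair is selected is not determined by the sentence itself but by the context, so the same sentence can be filed under different entries in different discourses.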

Since this definition of topic in terms of aboutness is attractive and seems to coincide with our intuition, many linguists adopt it (e.g., Lambrecht 1994; Erteschik-Shir 2007). However, I do not employ this definition even though my criteria for topics in (1) and Reinhart's (7) are apparently very similar, and even though the elements covered by these two definitions overlap most of the time. Given that I am interested in finding topic expressions in corpora, aboutness is not clear enough for my purpose. For example, Vallduví (1994) presents the following hypothetical mini-conversation between a newly-appointed White House butler (H<sub>1</sub>) and the Foreign Office Secretary after returning from a trip to Europe (S<sub>0</sub>).

	- S<sub>0</sub>: Yes. [The president] [hates the Delft china set].

(Vallduví 1994: 9, 12)

In this example, Vallduví identifies *hates the Delft china set* as focus; however, it passes the *about* test as shown in (9).


(9) The Foreign Office Secretary said **about the Delft china set** that the president hates it.

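Since I am assuming that topics are in complementary distribution with focus elements, the element in question is not a focus if it is a topic, and vice versa. The *about* test in (9) thus treats as a topic an element that belongs to the focus, which is inconsistent with this assumption.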

On the other hand, the *no*- and *aha*-tests proposed in §3.3.1 correctly identify *the president* as a topic and *the Delft china set* as a focus. As shown in (10-H<sub>2</sub>) and (11-H<sub>2</sub>), the topic *the president* cannot be argued against or repeated as news, whereas the focus *the Delft china set* can.

(10) S<sub>0</sub>: Yes. [The president] [hates the Delft china set].

H<sub>2</sub>: ?No, **the first lady** hates the Delft china set.

H′<sub>2</sub>: No, the president hates **Rockingham Pottery**.

(11) S<sub>0</sub>: Yes. [The president] [hates the Delft china set].

H<sub>2</sub>: ?Aha, **the president**.

H′<sub>2</sub>: Aha, **the Delft china set**.

Therefore, I conclude that the definition in (1) identifies topics better than the aboutness test, even though aboutness captures some aspects of our intuition about topics.

### **2.2.3 Evokedness**

Evoked information is commonly called "given" or "old" information. However, as pointed out in Prince (1981), the terms "given" and "old" are too ambiguous. Following Prince, I use the term "evoked information" for a referent that has been mentioned in the previous discourse or has been physically present in the speaker's and hearer's attention and hence "in the consciousness of the addressee [(or the hearer)] at the time of utterance" (Chafe 1976: 30). The terms "the focus (center) of attention", "anaphoric", "predictable" (Kuno 1972), and "active" (Portner 2007) are understood in the same way.

Most researchers agree that evoked information is not the topic itself (Reinhart 1981; Gundel 1988; Lambrecht 1994: *inter alia*). As is well known, evoked elements can be a focus instead of a topic, as shown in (12-B).


(12) A: Who did Felix praise?

B: [Felix praised] [himself.]

(Reinhart 1981: 72, style modified by NN)

In (12-B), it is obvious that *himself* is evoked information since the referent is mentioned in the previous context as well as the sentence itself. At the same time, it is a focus because it is the answer to the *wh*-question (see also the discussion on focus in §2.3 below). Given that foci cannot be topics, *himself* in (12-B) is not a topic.

Moreover, as has been pointed out by many scholars (see Li 1976; Givón 1983; Halliday 2004: *inter alia*), topics are frequently evoked, but this is not always the case.

### **2.2.4 Subject**

As pointed out in Li (1976), topics are frequently, but not always, subjects. For example, the whole utterance in (13-a-d) can be the answer to the question "what happened?", indicating that the subjects in these utterances are part of the focus, and therefore cannot be a topic.

(13) What happened?


(Gundel 1974: 49, modified by NN)

Topics are not always subjects, either. Objects and other elements can also be topics. In (14), the object of each sentence is a topic. The information structure is annotated by the author; note, however, that a context would be necessary to clarify the information structure in this example.

	- b. [As for that dress] , I promise I won't wear [it.]
	- c. (What about) [beans] , does he like [them?]

(Gundel 1974: 27, modified by NN)

However, it is also important to note that topics are frequently subjects (Li 1976).


### **2.2.5 Sentence-initial elements**

Chomsky (1965) and Halliday (1967) characterize topics as the sentence-initial element (more recently, see Hajičová et al. (2000)). To define the topic in terms of linguistic form pre-empts the goal of this study, namely, to figure out the association between information structures (topic and focus) and linguistic forms (particles, word order, and intonation).

Moreover, there are cases where sentence-initial elements are not topics. For example, the sentences in (13) in the last section are topicless, meaning that the sentence-initial elements cannot be topics. Conversely, topics do not always appear sentence-initially:

(15) (What about the proposal?) – [Archie rejected] [{it/the proposal}.]

We will examine topics which appear after the predicate in Chapter 5. As will be discussed, topics frequently appear sentence-finally in casual spoken Japanese and in many other languages, and in this position have their own characteristics.

### **2.3 Focus**

In this section, I review different definitions of focus, as well as notions closely associated with it. Like topic, focus is also a controversial notion and the literature disagrees on its definition as well as its properties. In the following subsections, I again classify different definitions of focus into representative groups, but discuss my own definition of the term first for clarity.

### **2.3.1 The definition of focus in this study**

Since I try to capture phenomena of information structure in a single layer, I believe that topic and focus should be mutually exclusive rather than overlapping with each other, as has been mentioned above. Therefore, I define the notion of focus as in (16) (see also the discussion in §3.3.2).

(16) The focus is a discourse element that the speaker assumes to be news to the hearer and possibly controversial. S/he wants the hearer to learn the relation of the presupposition to the focus by his/her utterance. In other words, focus is an element that is asserted.

Like (1), this definition also follows and elaborates the idea of focus (*heisetsu-tai* 'plain form') in Matsushita (1928). He states that "whereas the theme of judgement [topic] should not be changed before the judgement, materials to be used for the judgement [focus] are indeterminate, variate, and free since the speaker uses these materials at his/her own choice" (p. 774, translated by NN).

I believe the statement that the speaker "wants the hearer to learn the relation of the presupposition to the focus" in (16) is essentially the same as the definition of comment in Gundel (1988), which reads as follows.

(17) A predication, P, is the comment of a sentence, S, iff in using S the speaker intends P to be assessed relative to the topic of S. (Gundel 1988: 210)

Lambrecht (1994) (based on Halliday 1967) also employs the same definition of focus as stated in (18).

(18) [T]he focus of a sentence, or more precisely, the focus of the proposition expressed by a sentence in a given utterance context, is seen as the element of information whereby the presupposition and the assertion *differ* from each other. The focus is that portion of a proposition which cannot be taken for granted at the time of speech. It is the *unpredictable* or pragmatically *non-recoverable* element in an utterance. (Lambrecht 1994: 207, underlined by the original author)

Unpredictability or non-recoverability (see also Kuno 1972) is also very similar to the definition in (16).

I use the term *assertion* in the sense of Stalnaker (2004). He argues that, among possible worlds, a single world is chosen by the assertion. I consider this to be equivalent to "being news to the hearer." The reason why I do not simply say "focus is the element being asserted" is that to single out a world from many possible worlds might be confused with contrastiveness. As will be discussed in §2.3.3, focushood and contrastiveness are similar but different notions.

As has been pointed out in many studies (e.g., Matsushita 1928; Chomsky 1965; Gundel 1974), the answer corresponding to a *wh*-question is a typical focus. The following examples are from Lambrecht (1994: 121). The interpretation of information structure is by the author and might slightly differ from Lambrecht's original intention.

	- (19) Q: What did the children do next?
	- A: [The children] [went to school.]
	- (20) Q: Who went to school?
	- A: [The children] [went to school.]
	- (21) Q: What happened?
	- A: [The children went to school.]

Focus is news (or newsworthy in Mithun 1995) for the hearer and can be repeated as what s/he learned from the current utterance. For example, in (22), the topic *John* in (22-A) cannot be repeated as news by B, whereas (part of) the focus *teacher* can be repeated by B′.

(22) A: [{As for/Regarding} John], [he] [is a teacher].

B: ??Aha, **John**.

B′: Aha, **a teacher**.

*No* tests based on Erteschik-Shir (2007) are also available. See discussion in §3.3.2. The identification of focus using *wh*-question-answer pairs, such as (19)–(21), or the *aha* test (22) rests on the assumption that foci are news or newsworthy, while *no* tests like (12) in §3.3.2 are based on the assumption that foci can be controversial.

In the following sections, I review various notions associated with foci and how they relate to the discussion of foci in the present work.

### **2.3.2 Newness**

Newness is known to correlate with focushood (Li 1976; Givón 1983; Halliday 2004: *inter alia*). Although different researchers use the term *new* to refer to different concepts, I use this term to indicate strictly "new" in terms of Prince (1981) or "what the speaker assumes he is introducing into the addressee's consciousness by what he says" (Chafe 1976: 30). The other kind of newness, which is called "relational new" in Gundel (1988), is excluded from the current discussion. According to Gundel & Fretheim (2006: 177), relational newness is described as follows.

(23) Y [focus] is new in relation to X [topic] in the sense that it is new information that is asserted, questioned, etc. about X. Relational [...] newness thus reflects how the informational content of a particular event or state of affairs expressed by a sentence is represented and how its truth value is to be assessed.

The notion of "relational new" corresponds to focus in this study and the notion of comment in Gundel (1988).

The literature agrees that not all foci are new. As discussed in §2.2.3, focus can be an evoked element. (12), repeated here as (24), is an example of this case; *himself* in (24-B) is evoked because the referent "Felix" has already been mentioned in the preceding utterance (24-A), and, at the same time, it serves as focus because it corresponds to the answer part of the *wh*-question in (24-A).

(24) A: Who did Felix praise? B: [Felix praised] [himself.]

(Reinhart 1981: 72, style modified by NN)

On the other hand, all new elements can be foci. It is well known that, in English, (specific or non-generic) indefinite noun phrases cannot be topics. For example, Gundel (1974), discussing the following examples, concludes that indefinite noun phrases cannot be topics. As shown in (25-a) and (26-a), indefinite noun phrases cannot be put in the frames *concerning* and *about*; nor can they appear in the frame *what about*.


b. \*What about a lion? – Bill shot him. (*ibid.*)

I argue that new elements that have been known to the hearer before the utterance, i.e., "unused" in terms of Prince (1981), can be either topics or foci. They are new in the sense that the speaker is introducing them into the hearer's consciousness by what s/he says; but they are given in the sense that they are assumed by the speaker to be shared with the hearer. In Chapter 5, I argue that, in fact, unused elements have characteristics of both topics and foci.

### **2.3.3 Contrastiveness**

Many studies, particularly in generative linguistics, associate focushood with contrastiveness (frequently accompanied by a pitch peak). Here I base my discussion on Rooth (1985; 1992), who was inspired by von Stechow (1991), since his theory is one of the most influential studies on focus as contrastiveness.

In his theory, alternative semantics, where focus is related to the intuitive notion of contrast, Rooth argues that the function of focus is to evoke alternatives; in other words, the focus element is contrasted with the alternatives. For example, consider (27) in two cases, one in which *Mary* is focused and one in which *Sue* is focused.


(27) Mary likes Sue.

The former case evokes the set of propositions of the form 'x likes Sue', as formalized in (28-a), whereas the latter case evokes the set of propositions of the form 'Mary likes y', as formalized in (28-b).

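	- (28) a. $[\![\,[\,[\text{Mary}]_F \text{ likes Sue}\,]\,]\!]^f = \{\mathbf{like}(x, \mathbf{s}) \mid x \in E\}$
	- b. $[\![\,[\,\text{Mary likes } [\text{Sue}]_F\,]\,]\!]^f = \{\mathbf{like}(\mathbf{m}, y) \mid y \in E\}$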

(Rooth 1992: 76)

Among the members of these sets, Mary is chosen as the one who likes Sue in (28-a), and Sue is chosen as the one who Mary likes in (28-b).
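The alternative sets described here can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions, not Rooth's own formalization: the function name `focus_alternatives`, the tuple encoding of propositions, and the three-member domain `E` (with a hypothetical third individual `j`) are all introduced here for exposition.

```python
def focus_alternatives(predicate, args, focus_index, domain):
    """Build the set of propositions obtained by replacing the focused
    argument with every individual in the domain.

    predicate   -- name of the relation, e.g. "like"
    args        -- tuple of arguments of the ordinary meaning, e.g. ("m", "s")
    focus_index -- position of the focused argument within args
    domain      -- set of individuals (an assumed, toy domain)
    """
    alternatives = set()
    for individual in domain:
        new_args = list(args)
        new_args[focus_index] = individual
        alternatives.add((predicate, tuple(new_args)))
    return alternatives


# Hypothetical domain: m = Mary, s = Sue, j = some third individual.
E = {"m", "s", "j"}

# Focus on Mary: propositions of the form 'x likes Sue'.
subject_focus = focus_alternatives("like", ("m", "s"), 0, E)

# Focus on Sue: propositions of the form 'Mary likes y'.
object_focus = focus_alternatives("like", ("m", "s"), 1, E)

print(sorted(subject_focus))
print(sorted(object_focus))
```

Under this encoding, the proposition that Mary likes Sue is the only member shared by the two sets, which reflects the fact that the two focus placements have the same ordinary meaning but evoke different alternatives.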

The characterization and formalization of focus in alternative semantics is clear and seems to work well. However, characterizing foci as contrastive is problematic for our assumptions: we have assumed that topic and focus are mutually exclusive, and yet there can be contrastive topics and contrastive foci, as has been pointed out in Vallduví & Vilkuna (1998). Especially problematic for us is the existence of contrastive topics. If contrastiveness is equal to focushood, one has to admit that a contrastive topic is both topic and focus. Following Vallduví & Vilkuna (1998), I argue that this is very confusing for a theory of information structure and it is more plausible to assume that contrastiveness is a feature independent of both topichood and focushood. For example, as will be discussed in Chapter 4, the particle *wa* in Japanese is sensitive to some properties of topichood, whereas the particle *ga* is sensitive to some properties of focushood. In addition to this, these two particles are also sensitive to contrastiveness: they are obligatory when contrast is involved but are optional in other cases. Still, contrastive *wa* and *ga* are sensitive to topichood and focushood, respectively. Therefore, this study assumes that contrastiveness is independent of topic and focus. However, it is highly likely that other languages work differently. Further study is needed to investigate whether contrastiveness is independent of topic and focus in all languages.

### **2.3.4 Pitch peak**

Some studies assume that focus involves a pitch peak. For example, Chomsky (1972: 100) states that "phrases that contain the intonation center [pitch peak in the present work] may be interpreted as focus of utterance". Gundel (1988: 230) reports that the association between pitch peak and focus is found in typologically, genetically, and geographically diverse languages and concludes that this association seems to be universal. According to her, a focus is given a pitch peak at least in English, Guarani, Russian, and Turkish with the only exception of Hixkaryana (see also the references in her work and Büring 2007).<sup>3</sup>

As has been pointed out in previous studies on other languages (e.g., Jackendoff 1972: §6.2), however, I do not employ the definition of focus as a pitch peak because the goal of this study is to investigate the association between information structure and linguistic forms including intonation; the definition of focus as a pitch peak spoils the goal of our study. Moreover, I will argue in Chapter 6 that elements other than focus are given a pitch peak. For example, a topic that is reintroduced in the discourse is produced prominently (see also Gundel 1999). It is also well known that contrastiveness correlates with pitch peak. Therefore, regarding focus as an element with pitch peak causes great confusion.

### **2.4 Characteristics of Japanese**

In this section, I provide a rough overview of the typological characteristics of Japanese. Most of the literature on Japanese is based on written language; therefore, most of this section is also based on written Japanese – except for the parts that have to do with sound, such as intonation. I discuss differences between written and spoken Japanese where necessary.

### **2.4.1 General characteristics**

Japanese is an SOV language, with typical OV characteristics in terms of Dryer (2007): it has postpositions (which are called particles in this study), genitives precede nouns, adverbial subordinators appear after the verb, main verbs precede auxiliary verbs, question particles and complementizers appear after the verb, subordinate clauses precede main clauses, and relative clauses precede nouns (Shibatani 1990; Masuoka & Takubo 1992). Moreover, nouns are preceded by adjectives and demonstratives, and verbs are followed by many kinds of suffixes indicating tense, modality, negation, passive voice, causativity, and so on. (29) shows some examples of Japanese sentences. "A" stands for the agent-like argument of transitive clauses; "S" stands for the only argument of intransitive clauses; and "P" stands for the patient-like argument of transitive clauses.

(29) a. taroo-ga hanako-ni hon-o yat-ta<br>
Taro-nom Hanako-dat book-acc give-past<br>
'Taro gave a book to Hanako.' (A + DAT + P + V)

<sup>3</sup> See Downing (2012) for more exceptions.



(Shibatani 1990: 257–258, glosses modified by NN)

The features of Japanese most relevant for this study are the order of the subject, object, and the verb and the order of nouns and particles. Also, as will be discussed in §2.4.3, arguments such as subjects and objects can be 'scrambled', i.e., word orders other than the basic word order are found in both spoken and written Japanese.

In written Japanese, the particles *ga* and *o*, which follow nouns, are considered to be a nominative particle and an accusative particle respectively, and Shibatani glossed them as such. As will be discussed below, however, zero particles are extensively used in spoken Japanese and the characterization of *ga* as the nominative marker and *o* as the accusative marker does not necessarily reflect the exact properties of these particles. Since the literature is mainly based on written Japanese, I keep the glosses of nom for *ga* and acc for *o* in this chapter. In the same way, I will use top for *wa* since most of the literature agrees that *wa* is a topic marker (no matter what it means), although, again, the zero particle is extensively used in the spoken language. However, the reader should keep in mind that the glosses in this chapter are tentative. I will not use nom, acc, and top in the following chapters; instead, I will just gloss *ga*, *o*, and *wa* for each particle.

Japanese extensively employs so-called zero pronouns. In (30), for example, pronouns such as 'I', 'him', and 'it' are not explicitly uttered.

(30) a. zyon-ga ki-ta-node, ai-ni it-ta<br>
John-nom come-past-since meet-dat go-past<br>
'Since John came, (I) went to see (him).'

b. zyon-ga dekire-ba suru-desyoo<br>
John-nom can-if do-will<br>
'If John can (do it), (he) will do (it).' (Kuno 1973b: 17)

These omitted pronouns are sensitive to the information status of the referents (see Kuno 1978: Chapter 1).

The language has five vowels and 15 consonants (although the number may vary depending on the analysis). The syllable structure is relatively simple: a syllable basically consists of a consonant and a vowel, where long vowels, geminates, and final nasal codas are possible. Also, /y/ ([j]) can appear between a consonant and a vowel as in *kyoo* ([kjo:]) 'today' as opposed to *koo* ([ko:]) 'this way'.

Pitch accent plays an important role in Japanese. The systems of pitch accent vary among dialects; here I review the accent system of Standard Japanese (spoken around Tokyo), which is the variety investigated in the present study. First, in Standard Japanese, the pitch is either high or low, and the pitches of the first and the second syllables are different. If the first syllable is high, the second syllable is low, and vice versa. Second, the accent nucleus (indicated by ^) specifies where the pitch falls. For example, [ha^ɕi] 'chopsticks' indicates that [ha] is high and [ɕi] is low. On the other hand, [haɕi^] 'bridge' indicates that [ha] is low and [ɕi] is high. Words without an accent nucleus are also possible, as in the case of [haɕi] 'edge', which is pronounced in the same way as 'bridge' in isolation. The distinction between [haɕi^] 'bridge' and [haɕi] 'edge' can be made, for example, by examining the accentless particles following them. For example, when *ga* 'nom' follows [haɕi^] 'bridge', the pitch of *ga* is low because the accent nucleus specifies where the pitch falls. On the other hand, when *ga* follows [haɕi] 'edge', *ga* is produced in a high pitch. Thereby [haɕi^] 'bridge' and [haɕi] 'edge' can be distinguished from each other. In addition to phonemes and pitch accents, issues on intonation will be discussed in more detail in §2.4.4, since they are one of the main topics of this study.
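The pitch assignment described in this paragraph can be sketched as a short procedure. The Python fragment below is a minimal illustration under stated assumptions rather than a full model of Standard Japanese accent: the function name `pitch_pattern`, the numeric encoding of the accent nucleus (0 for no nucleus, otherwise the 1-based position after which the pitch falls), and the romanization of the units as `ha` and `si` are all introduced here for exposition.

```python
def pitch_pattern(units, nucleus, particle=None):
    """Return a string of H/L pitches, one per unit (plus one for the particle).

    units    -- list of syllable-sized units, e.g. ["ha", "si"]
    nucleus  -- 0 for a word without an accent nucleus; otherwise the
                1-based index of the unit after which the pitch falls
                (an encoding assumed for this sketch)
    particle -- optional accentless particle such as "ga"
    """
    if not 0 <= nucleus <= len(units):
        raise ValueError("nucleus must be between 0 and len(units)")
    sequence = list(units) + ([particle] if particle is not None else [])
    pitches = []
    for position in range(1, len(sequence) + 1):
        if position == 1:
            # The first unit is high only when the fall comes right after it.
            high = nucleus == 1
        else:
            # Later units stay high up to the nucleus; without a nucleus
            # the pitch never falls.
            high = nucleus == 0 or position <= nucleus
        pitches.append("H" if high else "L")
    return "".join(pitches)


print(pitch_pattern(["ha", "si"], 1))        # 'chopsticks' -> HL
print(pitch_pattern(["ha", "si"], 2))        # 'bridge'     -> LH
print(pitch_pattern(["ha", "si"], 0))        # 'edge'       -> LH
print(pitch_pattern(["ha", "si"], 2, "ga"))  # 'bridge' + ga -> LHL
print(pitch_pattern(["ha", "si"], 0, "ga"))  # 'edge' + ga   -> LHH
```

The last two calls show the point made in the text: 'bridge' and 'edge' have the same pattern in isolation and differ only in the pitch of a following accentless particle.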

### **2.4.2 Particles**

As mentioned above, nouns in Japanese are followed by various particles or postpositions. In general, they are believed to be clitics and indicate the status of a noun in a clause.<sup>4</sup> In this section, I review the literature on the particles that will be investigated in this study, namely *ga*, *o*, and *wa*. Note again that the literature is mainly on written Japanese. In §2.4.2.7, I present a review of the literature on zero particles, which are widely used in spoken Japanese in place of *ga*, *o*, and *wa*.

<sup>4</sup>Although the equal sign (=) is usually used for clitic boundaries, I use the hyphen (-) and do not distinguish clitics from affixes for the sake of simplicity.

### **2.4.2.1 Case particles vs. adverbial particles**

In the present study, I discuss two kinds of particles that attach to nouns: case particles and adverbial particles. Case particles such as *ga* and *o* code the grammatical relations of the nouns. For example, in (31), *ga*, which follows the noun *taroo*, codes nominative case, whereas *o*, which follows the noun *hon* 'book', codes accusative case.


Adverbial particles, on the other hand, sometimes follow and sometimes replace case particles and add meaning to the sentence. The adverbial particle discussed in this study is *wa*.<sup>5</sup> *Wa* can replace *ga* and *o* and turn the noun into a topic. It sometimes replaces and sometimes follows *ni* 'dat'. For example, each noun in (31) can be *wa*-marked in the following ways.

	- b. hon-**wa** taroo-ga hanako-ni yat-ta<br>
	book-top Taro-nom Hanako-dat give-past<br>
	'Regarding the book, Taro gave it to Hanako.'
	- c. hanako-(ni)-**wa** taroo-ga hon-o yat-ta<br>
	Hanako-(dat)-top Taro-nom book-acc give-past<br>
	'Regarding Hanako, Taro gave a book to her.'
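The replacement pattern illustrated by these examples can be summarized in a small sketch. The Python function below is a hypothetical illustration, not an analysis proposed in this study: the name `wa_marked_forms` and the hyphenated string output are assumptions, and the function covers only the three case particles mentioned in this section (*ga*, *o*, and *ni*).

```python
def wa_marked_forms(noun, case_particle):
    """List the wa-marked forms of a noun, given its case particle.

    Only the pattern stated in the text is encoded:
    wa replaces ga and o, and either replaces or follows ni.
    """
    if case_particle in ("ga", "o"):
        # wa replaces the case particle.
        return [f"{noun}-wa"]
    if case_particle == "ni":
        # wa may replace ni or follow it.
        return [f"{noun}-wa", f"{noun}-ni-wa"]
    raise ValueError(f"particle not covered by this sketch: {case_particle}")


print(wa_marked_forms("taroo", "ga"))
print(wa_marked_forms("hon", "o"))
print(wa_marked_forms("hanako", "ni"))
```

The sketch deliberately ignores word order and information structure, which interact with *wa*-marking in ways that a lookup of this kind cannot capture.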

There are complex interactions between *wa*-marking and word order (e.g., Kuroda 1979), which will be discussed in Chapter 5.

### **2.4.2.2** *Ga*

Almost all studies agree that *ga* in contemporary Japanese is a case marker that codes nominative case (e.g., Yamada 1936; Kuno 1973b; Tanaka 1977; Shibatani 1990). *Ga* is also said to code the "subject" (e.g., Kuroda 1979: 164). In addition, it can code genitive case and the object (in terms of this study, P). I do not introduce these usages since they are irrelevant to the present work. See, for example, Ono (1975); Nishida (1977); Yasuda (1977); Kuno (1973b); Shibatani (2001).

<sup>5</sup>There are other adverbial particles such as *mo* 'also' and *dake* 'only', which also follow or replace case particles. As the glosses 'also' and 'only' suggest, they are translated as adverbs in English, which is why they are called "adverbial" particles.

Recent studies are more interested in the mapping between surface form (such as *ga* and *o*) and the semantic (or deep) structure of predicates. See Kondo (2003) for a survey of such studies.

**2.4.2.2.1 Exhaustive listing vs. neutral description** Kuno (1973b) distinguishes two types of *ga*: exhaustive listing and neutral description. In terms of the present study, exhaustive listing corresponds to argument focus (or narrow focus), while neutral description corresponds to part of predicate focus and sentence focus (or broad focus), although whether the latter *ga* codes focus or not is controversial as will be discussed below. Examples (33-a-b) are instances of exhaustive listing and neutral description, respectively.

(33) a. **Exhaustive listing**<br>
zyon-**ga** gakusei-desu<br>
John-nom student-cop.plt<br>
'(Of all the people under discussion) John (and only John) is a student.' 'It is John who is a student.'

b. **Neutral description**<br>
ame-**ga** hutte i-masu<br>
rain-nom fall prog-plt<br>
'It is raining.' (Kuno 1973b: 38)

Kuno, following Kuroda (1979), proposes that *ga* of neutral description can only code the subject (As and Ss in this study) of action verbs, existential verbs, and adjectives/nominal adjectives that represent changing states, whereas *ga* of exhaustive listing can attach to any kind of noun. This is not the topic of the present work, which does not examine the associations between information structure and predicate type, although this is a very important topic. See Masuoka (2000: Chapter 4), which extensively discusses this issue.

**2.4.2.2.2** *Ga* **as focus marker** Last but most important for the present work, *ga* is sometimes described as a focus marker. *Ga* of exhaustive listing in Kuno (1973b) corresponds to *ga* as a focus marker (Heycock 2008). *Ga* coding new (unpredictable) information (Kuno 1973a: Chapter 25) is also related to *ga* coding focus.


Noda (1995) classifies *ga* of exhaustive listing as a focus marker, or *toritate* particle, while he argues that *ga* of neutral description is a case marker.<sup>6</sup> *Toritate* can be literally translated as 'taking up' and is intended to mean 'to make something remarkable'. *Toritate* particles are defined as particles that make part of a sentence or a phrase remarkable and emphasize that part (Miyata 1948: 178). *Toritate* particles include *mo* 'also', *sae* 'even', *dake* 'only', etc., whose counterparts in other languages are generally classified as focus markers. Therefore, I conclude that *toritate* particles, including *ga* with exhaustive-listing readings, correspond to focus particles.<sup>7</sup>

Ono et al. (2000) go further and claim that *ga* in natural conversation does not code As and Ss; rather, they claim that "*ga* is well characterized as marking that its NP is to be construed as a participant in the state-of-affairs named by the predicate in pragmatically highly marked situations" (p. 65). In other words, "*ga* is found in pragmatically highly marked situations where there is something unpredictable about the relationship between the *ga*-marked NP and the predicate such that an explicit signalling of that relationship becomes interactionally or cognitively relevant" (ibid.). Although it is not perfectly clear what they mean by "pragmatically marked situations", part of what they mean is that *ga* functions as a focus marker, since they use *ga* coding new or unpredictable information as a piece of evidence that supports their claim. In (34-b), for example, *ga* codes the answer to the question 'what club (are you going to) join?' in (34-a).

	- b. handobooru-**ga** ii-kana-toka omotte [...]<br>
	handball-nom good-q-hdg think<br>
	'(It's) handball (I want to join), (I) think.'

(Ono et al. 2000: 70)

<sup>6</sup>Tokieda (1950/2005) classifies some uses of *ga* into "particles which represent limitation" (p. 188ff.), which are also close to focus markers.

<sup>7</sup>However, many researchers also classify the so-called topic marker *wa* as a *toritate* particle; some of them only include contrastive *wa* (Okutsu 1974; 1986; Numata 1986), others include both contrastive and non-contrastive *wa* (Miyata 1948; Suzuki 1972; Teramura 1981; Noda 1995). I do not believe that *wa*, including contrastive *wa*, is a focus marker: the notions of focushood and contrastiveness are frequently confused, but they should be discussed independently. Therefore, I regard *toritate* particles as the equivalent of focus markers in other languages.


**2.4.2.2.3 Remaining issues** It is indeed the case that *ga* sometimes follows nouns that are in a case that is not the nominative, as shown in (35). (See Chapter 4 for detailed discussion.) In (35-a), *ga* follows the postposition *kara* 'from (abl)', meaning that the noun cannot be nominative. In a similar manner, *ga* follows *to* 'with (com)' in (35-b) and *made* 'til (lim)' in (35-c).<sup>8</sup>

	- b. kotira-wa nihonsyu-**to**-**ga** au-desyoo<br>
	this-top sake-com-*ga* match-will<br>
	'This one goes well with sake.' (A review from *Tabelog*<sup>10</sup>)
	- c. ie-ni kaeru-**made**-**ga** ensoku-desu<br>
	home-dat return-lim-nom excursion-cop.plt<br>
	'Until (you) arrive at home is the excursion. (Before you arrive at home, you are on the way of excursion.)' (Common warning by school teachers)<sup>11</sup>

As will be discussed in detail in Chapter 4, this type of *ga* codes focus rather than nominative case. However, it is too extreme to claim that no kind of *ga* codes the nominative. For example, it is never possible to replace *o* in (31) with *ga* no matter how much *hon* 'book' is focalized. It is clear that *ga* sometimes codes nominative case, sometimes codes focus, and sometimes codes both. Also, as will be outlined below, zero particles are extensively used in spoken Japanese. Therefore, the question is under what conditions *ga* codes focus, under what conditions it codes nominative, and when *ga* is used instead of the zero particles. Also, what motivates *ga* to code focus? This is not the place to discuss whether *ga* codes focus or nominative case. I discuss these issues in Chapter 4.

### **2.4.2.3** *O*

There are fewer studies on the particle *o* and, as far as I am aware, almost all studies agree that *o* is an accusative marker and that it codes the patient-like argument in transitive clauses (e.g., Yamada 1936; Shibatani 1990). There are two non-canonical usages of the particle *o*: coding time and place of transferring (Yamada 1936).

<sup>8</sup>(35-b) is not acceptable for some people.

<sup>9</sup>Toriyama, Akira (1990) *Dragon Ball* 23, p. 149. Tokyo: Shueisha.

<sup>10</sup>http://tabelog.com/ehime/A3801/A380101/38006535/dtlrvwlst/2992604/, last accessed on 03/23/2015

<sup>11</sup>I found 32,700 websites using this expression with Google exact search (searched on 06/17/2015).

**2.4.2.3.1 Remaining issues** Both of these non-canonical usages of *o* concern the mapping between surface forms and semantic structures, as discussed in the paragraph on *ga* and "object" marking. Therefore, I consider these issues to be independent of information structure.

As with *ga*, zero particles are extensively used instead of *o* in spoken Japanese. It is therefore necessary to investigate the distribution of zero particles and *o*. I propose conditions for the use of zero particles and *o* in Chapter 4. I will give an overview of the literature on the zero particles in §2.4.2.7.

### **2.4.2.4** *Wa*

The adverbial particle *wa* has been widely discussed in the literature because the conditions on where it appears are very complex and subtle.

In the early literature of modern Japanese linguistics, *wa* was confused with a nominative marker because most of the time the particle codes so-called nominative case in place of *ga*. According to Aoki (1992: 2), who studied more than 10,000 examples of *wa* in novels and essays, 76.7% of *wa* codes nominative case, and 84.7% of *wa*-marked nouns code nominative case. Moreover, *wa* appears to "replace" *ga*. For example, the sentences in (36-a) with *wa* and (36-b) with *ga* are truth-conditionally equivalent, and replacing one particle with the other does not affect the truth value of the sentence.


b. zyon-**ga** gakusei-desu<br>
John-nom student-cop.plt<br>
'John is a student.' (Kuno 1973b: 38)

In the same way, (37-a) and (37-b) are truth-conditionally equivalent.

(37) a. ame-**wa** hutte i-masu-ga...<br>
rain-top fall prog-plt-though<br>
'It is raining, but...'

b. ame-**ga** hutte i-masu<br>
rain-nom fall prog-plt<br>
'It is raining.' (ibid.)


Therefore, *wa* was considered to code nominative case like *ga*.

Yamada (1936: 472ff.) pointed out that *wa* should be classified as an adverbial particle (*kakari joshi*)<sup>12</sup> and should not be confused with case particles such as *ga*. However, since *wa* codes nominative case most of the time, *wa* has been analyzed in opposition to *ga*. Since the nature of *wa* has been widely discussed, I can only give a simplified overview of representative analyses, each of which captures a certain aspect of the particle. Onoe (1977) is a useful survey of the history of studies on *wa*, and Noda (1996) is a good summary of contemporary studies. Here I focus on *wa*-marked nouns and put aside the other uses of the particle. For other types of *wa*, see, for example, Teramura (1991: Chapter 7).

The most popular analysis of *wa* is that it is a topic marker, which was proposed by Matsushita (1928).<sup>13</sup> However, the definition of topic itself is controversial in the literature, as we have seen in §2.2. So, the question of what a topic marker is still remains. In what follows, I outline various proposals in this regard found in the literature.

**2.4.2.4.1 Givenness** The first characterization of *wa* is that it codes given information (Chafe 1970: 233). Kuno (1973b) also makes a similar claim: *wa* codes anaphoric information, i.e., information that has been "entered into the registry of the present discourse" (45). According to Kuno (1973b), for example, (38-a) is unacceptable because *ame* 'rain' has not been entered into the present registry, whereas (38-b) is acceptable because *wa*-coded *ame* 'rain' has been registered. Note that the first-mentioned *ame* was coded by *ga* in (38-b).


The analysis that *wa* codes given information explains the fact that *wa* cannot attach to nouns such as *wh*-phrases like (39-a), quantified noun phrases like (39-b), and indefinite pronouns like (39-c). They represent new information and have not been entered into the registry of temporary discourse.

<sup>12</sup>Yamada distinguishes *kakari joshi* from *fuku joshi*. Although the English term *adverbial particle* sounds closer to *fuku joshi*, I use the term *adverbial particle* to include both *kakari joshi* and *fuku joshi* because this distinction does not matter for now.

<sup>13</sup>According to Onoe (1977), this was first proposed in *Ayuishô* by Fujitani Nariakira (1778).


Although I believe that Kuno's observation explains a condition of *wa*-coding well, his claim needs to be supported by more natural data because his grammatical judgements are not unanimously shared by all native speakers of Japanese. Moreover, as will be discussed in Chapter 4, 78 (41.1%) out of 190 cases of *wa* code new (non-anaphoric) information, i.e., nouns without antecedents in the previous contexts. Most of them are neither generic nor contrastive and need explanation. I will discuss the conditions of the use of *wa* in Chapter 4.

**2.4.2.4.2 Generic** *wa* Kuroda (1972) and Kuno (1973b) argue that generic nouns can always be marked by *wa*.<sup>14</sup> According to Kuno (1972), this is because they are "in the permanent registry of discourse, and do not have to be reentered into the temporary registry for each discourse" (p. 41). For example, the sentences in (40) are acceptable in an out-of-the-blue context.

	- 'Human beings die. (All humans are mortal.)' (Constructed)

In Chapter 4, however, I will show that not all generic nouns can be felicitously coded by this particle in an out-of-the-blue context. Instead, I propose that the generic condition of *wa*-coding is integrated into the givenness condition.

<sup>14</sup>Kuroda (1972) pays more attention to generic events rather than just nouns.


**2.4.2.4.3 Contrastive** *wa* Kuno (1973b) distinguishes between the *wa* coding given information (in his sense, anaphoric information) and the one coding contrastive information. He argues that contrastive *wa* can code new (in his term, "non-anaphoric") information, as shown in the contrast between (41-a) and (41-b). According to Kuno, *oozei-no hito* 'many people' in (41-a) is new and non-contrastive; therefore, the sentence is not acceptable. On the other hand, *oozei-no hito* 'many people' in (41-b) is new and is contrasted with *omosiroi hito* 'interesting person'; in this case, the sentence is acceptable. Contrastive *wa* is typically accompanied by high pitch. Note that the examples and acceptability judgements are by Kuno, and that in particular (41-b) is not acceptable to some people (including the author).


The contrast between (42-a) and (42-b) is explained in the same way.


While some studies like Kuno (1973b) assume that contrastive and non-contrastive *wa* are independent and mutually exclusive, others like Teramura (1991) speculate that they are governed by the same condition(s). Teramura (1991) claims that the basic property of the particle is to indicate contrast with other elements, and non-contrastive *wa* appears when the contrasted elements are not noticed.

Hara (2008) shows that contrastive *wa* always induces scalar implicatures as in (43-a) and proposes a formal analysis of the particle. Furthermore, Hara (2006) argues that these implicatures are conventional rather than conversational implicatures.


(Hara 2006: 36)

The present study does not aim at investigating detailed characteristics of contrastive *wa*; rather, I am more interested in capturing various aspects of *wa* as a whole, including its contrastive uses, and in giving a unified explanation for all of them. Therefore, issues like the syntactic position of contrastive *wa*, the interaction between contrast and negation or quantifiers, and their formal analyses are outside of the scope of this study. In Chapter 4, I will argue that contrastive and non-contrastive *wa* can be explained consistently with a single principle along the lines of Teramura (1991).

**2.4.2.4.4 Characterization of** *wa* **based on judgement types** Kuroda (1972), inspired by Franz Brentano and Anton Marty, proposed the distinction between *wa* vs. *ga* based on categorical vs. thetic judgements. According to Kuroda, "the categorical judgement is assumed to consist of two separate acts, one, the act of recognition of that which will be made the subject, and the other, the act of affirming or denying what is expressed by the predicate about the subject" (p. 154). On the other hand, the thetic judgement "represents simply the recognition or rejection of material of a judgement" (ibid.). Kuroda argues that sentences with *wa*, like (44-a), correspond to the categorical judgement, and those with *ga*, like (44-b), correspond to the thetic judgement.



The categorical judgement roughly corresponds to the predicate-focus structure, and the thetic judgement corresponds to the sentence-focus structure.

I assume that some part of judgement types can be reduced to particles. Therefore, the theory of judgement types and the analysis of particles are compatible and complement each other. In the present study, I only focus on the particles and leave the rest for future studies.

**2.4.2.4.5 Cohesion** Clancy & Downing (1987), analyzing spoken narratives, suggest that "*wa*-marking is not necessary to establish thematic status, nor does *wa*-marking, when it appears, necessarily indicate that the participant in question is thematic, to the extent that thematicity can be equated with the measures that [they] have considered, i.e., the frequency of appearance, persistence, or ability to elicit zero switch reference" (p. 24), contrary to other studies such as Maynard (1980). They conclude that "the primary function of *wa* is to serve as a local cohesive device, linking textual elements of varying degrees of contrastivity" (p. 46) because "the majority of *wa* uses in [their] data, whether thematic or locally contrastive or both, occurred on switch subjects, i.e., references to participants who by definition had been non-subjects when last mentioned" (ibid.).

I investigated whether this generalization applies to my data, CSJ (*the Corpus of Spontaneous Japanese*), which also includes spoken narratives as will be explained in the next chapter. First, I extracted all *wa*-coded NPs and pronouns and their antecedent NPs and pronouns. Then, I categorized the antecedents into so-called subjects (*ga*-coded NPs), objects (*o*-coded NPs), and datives (*ni*-coded NPs) and counted their numbers. As a result, it turned out that 13 subjects, 11 objects, and 10 datives were the antecedents of *wa*-coded NPs or pronouns. Although the numbers are very small and it is inappropriate to generalize based on them, it is clear that Clancy and Downing's claim does not hold in my data.
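As a rough illustration of how evenly these antecedent counts are spread across grammatical roles, the following sketch computes each role's share and a chi-square goodness-of-fit statistic. The equal-share null hypothesis, the variable names, and the helper `uniform_chi_square` are assumptions introduced here for illustration; they are not part of the analysis reported above, and with only 34 tokens the result should be read with caution.

```python
import math

# Antecedent counts of wa-coded NPs/pronouns as reported in the text.
counts = {"subject (ga)": 13, "object (o)": 11, "dative (ni)": 10}


def uniform_chi_square(observed):
    """Chi-square goodness-of-fit statistic against equal expected counts.

    The equal-share null hypothesis is an assumption made for this sketch.
    """
    total = sum(observed)
    expected = total / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


total = sum(counts.values())
shares = {role: n / total for role, n in counts.items()}
chi2 = uniform_chi_square(list(counts.values()))

# Three categories give 2 degrees of freedom; for df = 2 the chi-square
# survival function has the closed form exp(-x / 2), so no external
# statistics library is needed.
p_value = math.exp(-chi2 / 2)

for role, share in shares.items():
    print(f"{role}: {share:.1%}")
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

Under these assumptions the statistic stays far below conventional significance thresholds, which is in line with the observation that subject antecedents are about as frequent as object and dative antecedents in this small sample.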

Moreover, Watanabe (1989) argues, analyzing corpora, that *wa* codes important and definite nouns, contrary to Clancy & Downing (1987). Therefore, it is necessary to re-examine their claim.

**2.4.2.4.6 Isolation** It has been pointed out that *wa* isolates the nouns marked by it from the rest of the sentence. Onoe (1977) reports that this issue was observed in the 19th century in studies like *Colloquial Japanese* by Brown (1863) and *Japansche Spraakleer* by Hoffmann (1868). Onoe (1981: 103), supporting this view, argues that a sentence with *ga* as in (45-a) expresses a unified situation, whereas a sentence with *wa* as in (45-b) isolates or separates the noun from the predicate, in this case *sora* 'sky' from *aoi* 'blue', and then associates these two.


(45) a. sora-**ga** aoi
        sky-nom    blue
        'The sky is blue.'
     b. sora-**wa** aoi
        sky-top    blue
        'The sky is blue.'

He further argues that *wa* "drastically confirms the thetic judgement 'the sky is blue'" (ibid.).

While I believe that this characterization partly captures the nature of *wa*, it needs to be expressed within a theory and supported by more data.<sup>15</sup> For example, *ga* in (45-a) also separates *sora* from *aoi* because there is a phrase boundary. Where does the intuition of *wa*'s "isolation" come from? In Chapter 6, I argue that there is an intonation boundary between a topic and a focus; therefore, topics including *wa*-coded elements are intonationally separated from foci.

**2.4.2.4.7 Remaining issues** As I have mentioned above, the aim of this study is to give a consistent explanation of *wa*-coding, rather than a detailed model of some aspect of the particle. The characteristics summarized above reflect some of the aspects of *wa*. Later on, I will propose conditions for *wa* as a whole. As I also stated above, the properties of predicates and sentence types are outside the scope of this study. However, I believe that characterizing the particle *wa* will help us to understand other unexplained features in the future.

### **2.4.2.5** *Toiuno-wa*

In this section, I discuss the marker *toiuno-wa*, which will also be investigated in the present study. The marker consists of at least four morphemes, as shown in (46).

(46) to   iu-no-wa
     quot call-one-*wa*

The first morpheme *to* is a quotation marker, and *iu* corresponds to 'call' (or, more closely, 'heißen' in German). (47) is an example of how *to* and *iu*, which are realized as *to ii*, are used.

<sup>15</sup>Onoe seems to think that the existence of the contrastive *wa* supports the particle's "isolation" function. However, the connection between isolation and contrastiveness is not clear to me.


(47) hasi-wa       tyuugoku-go-de    nan-**to** **ii**-masu-ka
     chopstick-top China-language-in what-quot  call-plt-q
     'How do you call "chopsticks" in Chinese?' (Masuoka & Takubo 1992: p. 81)

The morpheme *no* is a nominalizer which corresponds to 'one' (as in *this one*) in English. It can be used when restrictively modified nouns are repeated or are clear from the context (p. 160).

(48) kono seetaa-wa   tiisai-node   ookii-**no**-to kaete    kudasai
     this sweater-top small-because big-one-with    exchange please
     'Since this sweater is too small, please exchange this with a bigger one.' (op. cit.: p. 160)

Masuoka & Takubo (1992) point out that the combination of noun + *to iu* + *mono* ('thing') is used when the speaker is talking about the category in general, rather than a specific referent of the noun. For example, *kyoosi* 'teacher' in (49-a) simply refers to specific teachers, whereas *kyoosi* followed by *-to iu mono* in (49-b) refers to teachers in general.

(49) a. sotugyoo-paatii-ni-wa    **kyoosi**-ga 20-mei seito-ga    140-mei syusseki si-ta
        graduation-party-dat-top teacher-nom   20-cl  student-nom 140-cl  attend   do-past
        '20 teachers and 140 students participated in the graduation party.' (Specific teachers)
     b. **kyoosi-to** **iu** **mono**-wa tuneni aizyoo-o mot-te   seeto-o     mitibika-nakere-ba nara-nai
        teacher-quot  call   thing-top   always love-acc have-and student-acc lead-neg-cond      become-neg
        'Teachers always must lead their students with love.' (Teachers in general)

(op. cit.: p. 34)

This also applies to *no*, which likewise refers to some category in general rather than a specific entity. In fact, *mono* in (49-b) can be replaced with *no* without changing the meaning. The morpheme *wa* is the particle discussed in the previous section.

Unless I am discussing the compositional meanings of *to iu no-wa*, I will put no spaces in *toiuno* because it is sometimes reduced to *(t)teno*, *t(y)uuno*, or even [tɯːnə]. I separate *wa* to keep the relationships between *toiuno-wa* and *wa* transparent, although *wa* sometimes merges with *toiuno* and the sequence is realized as [tɯːnəː], [tːɛnaː], [tsɯːnaː], etc.

While other combinations such as *toiuno-ga* and *toiuno-o* are possible, I focus on *toiuno-wa* because other combinations are rare in the corpus. Since there are only a few studies on *toiuno-wa* itself, I also include studies on *toiu* (without *no-wa*) in the following overview.

**2.4.2.5.1 Basic usage** According to Takubo (1989), the combination of *toiu* and basic category nouns (such as *hito* 'person' and *mono* 'thing') is sometimes used to introduce proper names that the hearer is assumed not to know.

(50) kinoo     tanaka siroo-**toiu** **hito**-ni ai-masi-ta
     yesterday Tanaka Shiro-called   person-dat  meet-plt-past
     'Yesterday I met a person called Shiro Tanaka.' (Takubo 1989: p. 218)

Similarly, Nihongo Kijutsu Bumpô Kenkyû Kai (2009) describes *toiuno-wa* as "presenting an expression as a topic and explaining the meaning or attributing a noun to a specific referent" (p. 230). (51-a) exemplifies the former, and (51-b) exemplifies the latter.

b. satoo-san-**toiuno-wa** eigyoo-bu-no      satoo-san-desu-ka zinzi-bu-no           satoo-san-desu-ka
   Sato-hon-*toiuno-wa*    sales-section-gen Sato-hon-cop-q    personnel-section-gen Sato-hon-cop-q
   'Which do you mean by "Mr. Sato", the person in the sales section or the person in the personnel section?' (Nihongo Kijutsu Bumpô Kenkyû Kai 2009: 230)

Sentences with *toiuno-wa* also express general properties of the topic or a judgement on what it should be. (52-a) is an example of the former, and (52-b) is an example of the latter.

(52) a. suzuki-**tteiuno-wa** aaiu      yatu-da-yo
        Suzuki-*toiuno-wa*    that.kind guy-cop-fp
        'Suzuki is that kind of guy.'
     b. kagaku-**toiuno-wa** honrai      heewa-no  tame-ni  yakudateru-beki mono-da
        science-*toiuno-wa*  essentially peace-gen sake-for use-should      thing-cop
        'We should use science for the sake of peace.' (op.cit.: 231)

**2.4.2.5.2 Characterization of** *toiuno-wa* **based on predication types** Masuoka (2012), who was inspired by Sakuma (1941), analyzes the association between predication types and the marker *toiuno-wa* and concludes that *toiuno-wa* is a topic marker only for property predication (or individual-level predication), as opposed to event predication (or stage-level predication). Property predication states the property of a referent (Masuoka 1987; 2008a), which is unbounded by space or time. Masuoka states that property predication corresponds to the individual-level predication proposed in Carlson (1977).<sup>16</sup> (53) exemplifies property predication, which is true regardless of time and space and hence is also unbound by time and space.

	- b. That person is kind.

(Masuoka 2008b: 4, translated by NN)

On the other hand, event predication describes an event bound by time and space as in (54).

(54) A child smiled. (op.cit.: 5)

This corresponds to stage-level predication in Carlson (1977).

To see that *toiuno-wa* is a marker of property predication only, compare the following examples. In (55-a), which expresses event predication bound by space and time, *toiuno-wa* cannot be used felicitously, while in (55-b), which expresses property predication unbound by space and time, *toiuno-wa* can be inserted.

<sup>16</sup>However, property predication and individual-level predication are not exactly the same because, according to Masuoka (2008b), the following examples, which are typically considered to be stage-level rather than individual-level predication, are classified as property predication.

(i) a. That person is busy.

b. My friend {has been to / went to} France many times.

(Masuoka 2008b: 5–6, translated by NN)

Masuoka states that they are atypical property predication. In any case, I do not get involved in the issue of predicate types in the present study.



**2.4.2.5.3 Remaining issues** Masuoka's characterization of *toiuno-wa* captures some aspects of this marker well. In the present work, I will discuss *toiuno-wa* from different perspectives and will not go into detail regarding predication types. I also aim at describing the relationships among other topic markers such as *wa* and *kedo*/*ga*, which will be discussed below.

### **2.4.2.6** *Kedo* **and** *ga*

Sometimes conjunctions can be used as topic markers. The present study discusses *kedo* and *ga* preceded by a copula, both of which correspond to 'although' or 'whereas' in English. *Kedo* and *ga* differ mainly in terms of register; *kedo* can be used in both casual and formal styles, whereas *ga* is mainly used in the formal style. *Ga* in (56-a) and *kedo* in (56-b), which are preceded by copulas, function as topic markers in the sense that they newly introduce topics at the beginning of a discourse or a paragraph, or are used to state different aspects of the current topic (Koide 1984; Takahashi 1999). Intuitively, 'that issue' in (56-a) and 'Yamada' in (56-b) are considered to be newly introduced.


Note that the so-called nominative *ga* is different from the conjunctive *ga* in various ways. For example, conjunctive *ga* does not directly follow nouns; rather, nouns must be followed by the copula (*desu*), as shown in (57-a). On the other hand, the so-called case marker *ga* can directly follow nouns, as shown in (57-b).


Note also that *ga* and *kedo* as topic markers are different from conjunctive *ga* and *kedo*. Conjunctive *ga* and *kedo* by definition follow clauses instead of phrases; on the other hand, the corresponding topic markers cannot follow clauses. Since *kedo*- or *ga*-coded NPs like *rei-no ken* 'that issue' in (56-a) and *yamada-no koto* 'Yamada's issue' in (56-b) appear to be the predicates of copular sentences, the subjects of these copular sentences should also be present. However, no subjects can be added in sentences like (56).

**2.4.2.6.1 Remaining issues** The characterization of *kedo* and *ga* as topic markers which introduce topics captures the distributions of these particles. In Chapter 4, I aim at capturing these markers as well as other topic particles from a unified point of view.

### **2.4.2.7 Zero particles**

While nouns in written Japanese are almost always followed by overt particles, zero particles (Ø) are ubiquitous in spoken Japanese. All kinds of core arguments (A, S, and P) can be basically coded by them, as exemplified in (58).



Although I employ the symbol Ø and use expressions like "zero-coding" and "zero particles", I do not necessarily claim that zero particles exist. Rather, I see them as equivalent to "bare NPs" or "NPs not followed by any particle", and consider the difference a matter of notation. For the sake of clarity, however, I use the symbol Ø and refer to bare nouns as "zero-coding". Also, I do not get involved in the discussion of whether zero particles are in fact zero or are simply omitted. I assume that each production of a zero particle in everyday usage is governed by unique and complex conditions. When somebody says "the particle X can be replaced with Ø in this context," I consider it to mean "the conditions of producing X and Ø in this context are not predictable in the current model".

In this section, I review conditions of zero-coding that have been proposed in the literature. Note that other parts of §2.4.2 focus on written Japanese, while this part focuses on spoken Japanese. Shimojo (2006) and Fry (2001) are useful surveys of the previous literature and I rely on them to review the literature here.

**2.4.2.7.1 Socio-linguistic factors** Tsutsui (1984) points out that zero particles are acceptable in less formal situations. Also, it has been reported that zero particles are used differently in different dialects (e.g., Sasaki 2006; Nakagawa 2013). I discuss the zero particles in casual forms spoken around Tokyo to control for stylistic and dialectal differences.

**2.4.2.7.2 Word and sentence length** Tsutsui (1984: 98ff.) also proposes that zero particles following monosyllabic nouns are less natural than those following multisyllabic nouns. Fry (2001: 123) reports that 40% of multisyllabic words are zero-coded, while 27% of monosyllabic words are zero-coded.<sup>17</sup> Moreover, Jorden (1974: 44) has claimed that zero-coding is especially frequent in short sentences. Fry (2001: 122ff.) compared short utterances (fewer than 10 words) with long utterances (10 words or more) and found that zero particles appear more often in short utterances. Henceforth, I focus on overt vs. zero particles following multisyllabic NPs in short sentences to avoid this factor.

**2.4.2.7.3 Contrast and narrow focus** Contrasted elements are always followed by *wa* (Tsutsui 1984: 53ff.). In (59-a), for example, *boku* 'I' and *biru* 'Bill' are contrasted and cannot be followed by zero particles.

<sup>17</sup>However, his results are more complex; the difference between the zero-coding ratios of multisyllabic words and monosyllabic words is significant for As and Ss, but not for Ps.


(59) a. boku-{**wa/\*Ø**} oyoi-da-kedo     biru-{**wa/\*Ø**} oyoga-nakat-ta-yo
        1sg-{top/Ø}       swim-past-though Bill-{*wa/Ø*}     swim-neg-past-fp
        'I swam, but Bill didn't swim.'
     b. boku-{**wa/Ø**} biiru-{**wa/\*Ø**} nomu-kedo    sake-{**wa/\*Ø**} noma-nai
        1sg-{*wa/Ø*}    beer-{*wa/Ø*}      drink-though sake-{*wa/Ø*}     drink-neg
        'I drink beer but not sake.' (Modified from Tsutsui 1984)<sup>18</sup>

As Tsutsui (1984: 93ff.) also pointed out, zero particles cannot be felicitously used in narrow-focus contexts (the argument focus structure or "exclusivity" in Tsutsui's term). In these contexts, overt particles are obligatory (see also Fujii & Ono 2000). As shown in (60-B), where *suteeki* 'steak' is focused, for example, the overt particle *o* is natural, while the zero particle Ø is not.


In a similar manner, *hon* 'book' in (61-B) can be naturally followed by *ga*, but not by Ø, because *hon* is narrow-focused.


Based on these facts, Shimojo (2006), following Lee (2002), proposes that the function of zero particles is to "withhold[...] reference to other referents which are potentially related to the proposition denoted by the sentence" (p. 131).

On the other hand, Matsuda (1996) and Fry (2001) report that *wh*-word Ps (such as *nani* 'what' and *dare* 'who') are more likely to be zero-coded than non-*wh*-word Ps. Fry found that 71% of *wh*-Ps are zero-coded, whereas 51% of non-*wh*-Ps are zero-coded. As exemplified in (62), zero-coded *wh*-Ps are not rare.<sup>19</sup>

<sup>18</sup>Many of Tsutsui's examples employ formal and polite forms rather than casual forms. Therefore, I modified all of his examples cited in the present study into casual forms to exclude the effect of formality.

<sup>19</sup>However, I did not find any examples of *dare* as P in *the Chiba three-party conversation corpus*.



The fact that *wh*-words are more likely to be zero-coded than non-*wh*-words contradicts Tsutsui's observation because, in general, *wh*-questions are considered to be in narrow focus. Similarly, Niwa (2006: Chapter 10) reports that zero-coding is acceptable for objects corresponding to the answer to a *wh*-question. Such objects are also considered to be in narrow focus and are therefore another counter-example to Tsutsui's claim. As shown in (63-A), the object *kootya* 'tea', which is the answer to a *wh*-question, can be coded by either *o* or Ø.


To complicate matters, *wh*-subjects can be zero-coded, but subjects corresponding to the answer to a *wh*-question cannot (Niwa 2006). As exemplified in (64), the *wh*-subject *dare* 'who' can be either zero-coded or *ga*-coded, but the subject corresponding to the answer cannot be felicitously zero-coded.


Fry (2003) reports that the ratio of zero particles coding *wh*-words for As and Ss (25%) is lower than the ratio of zero-coding for non-*wh*-As and Ss (32%), although the difference is not significant in a χ²-test.

**2.4.2.7.4 Word order** Tsutsui (1984: 108ff.) argues that zero particles can be used naturally "if the NP [...] is preceded by the subject of the sentence and immediately followed by the predicate" (p. 108). As instantiated in (65), Tsutsui claims that the zero-coded NP *eigo* 'English' in (65-a) is natural because it is preceded by the subject *boku* 'I' and immediately followed by the predicate *umai* 'good', while the zero-coding in (65-b) is unnatural because it is not immediately followed by the predicate.

	- 1sg-{*wa/Ø*} English-{*ga/Ø*} Hanako-than good-fp 'I'm better at English than Hanako.' (Tsutsui 1984: 110)

This is supported by Matsuda (1996) and Fry (2001). Fry (2001: 124), for example, found that 58% of verb-adjacent Ps are zero-coded, whereas 41% of non-verb-adjacent Ps are zero-coded.

Niwa (2006: 291ff.) points out that verb-adjacent NPs can be zero-coded more naturally when the NPs are non-topics (foci).<sup>20</sup> On the other hand, Niwa also found that clause-initial NPs can be naturally zero-coded when the NPs are topics. Compare (66) and (67). *Sugoi kawaii ko* 'very cute girl' in (66) is in focus because the NP is indefinite and is treated as news. In this case, the verb-adjacent NP can be felicitously zero-coded as in (66-a), whereas the non-verb-adjacent NP cannot naturally be zero-coded (66-b).

```
(66) a. oi  keiri-ka-ni            sugoi kawaii ko-{ga/Ø}   hait-ta-zo
        hey accounting-section-dat very  cute   girl-{ga/Ø} enter-past-fp
        'Hey, a very cute girl joined the accounting section.'
     b. oi  sugoi kawaii ko-{ga/?Ø}  keiri-ka-ni            hait-ta-zo
        hey very  cute   girl-{ga/Ø} accounting-section-dat enter-past-fp
        'Hey, a very cute girl joined the accounting section.' (Niwa 2006: 293)
```
By contrast, *ano ko* 'that girl' in (67) is topical because the NP is definite and the participants have previously discussed her. In this case, both the verb-adjacent and the non-verb-adjacent NPs can felicitously be zero-coded.

<sup>20</sup>There may be elements in a sentence that are neither topics nor foci. The present study, however, assumes that all core arguments are either topics or foci; therefore, if an element is not a topic, it is assumed that it is a focus.

(67) a. oi  keiri-ka-ni            **ano** **ko**-{**ga/Ø**} hait-ta-zo
        hey accounting-section-dat that    girl-{*ga/Ø*}     enter-past-fp
        'Hey, that girl joined the accounting section.'
     b. oi  **ano** **ko**-{**ga/Ø**} keiri-ka-ni            hait-ta-zo
        hey that    girl-{*ga/Ø*}     accounting-section-dat enter-past-fp
        'Hey, that girl joined the accounting section.' (ibid.)

**2.4.2.7.5 Types of predicates** Tateishi (1989) argues that zero particles are natural only inside V′. The subjects of a stage-level predicate or of an unaccusative predicate can be naturally zero-coded because they are realized inside V′. On the other hand, the subjects of an individual-level predicate or an unergative predicate are realized outside V′ (see also Kageyama 1993: 56–57). As shown by the contrast between (68) and (69), the subjects of unaccusative predicates (68) can naturally be either zero- or *ga*-coded, while those of unergative predicates (69) can only be coded by *ga*; zero-coding results in anomaly.

(68) Unaccusative predicate

(69) Unergative predicate


Yatabe (1999) points out that there are counter-examples to Tateishi's generalization, citing an example from Niwa (1989). The predicate *happyoo suru* 'give a presentation' is an unergative predicate, and it is possible to zero-code the agent of this action, as shown in (70).

(70) kondo     gengo-gakkai-de           yamada-san-{**ga/Ø**} happyoo      suru-n-da-tte
     next.time linguistic-conference-loc Yamada-hon-{*ga/Ø*}   presentation do-nmlz-cop-quot
     'I heard that Mr. Yamada is going to give a presentation at the next linguistic conference.' (Niwa 1989: 49)

Note, however, that this example is topical zero-coding, rather than focal zero-coding, and these two might be different from each other.

Yatabe also argues against Tateishi's claim that zero particles cannot naturally follow the subject of an individual-level predicate. Although I do not get involved in this discussion because it is outside the scope of the present study, I suggest that this is also attributable to the distinction between topic vs. focus zero particles.

**2.4.2.7.6 Types of nouns** The hierarchy of features proposed in Silverstein (1976; 1981) also plays a crucial role in zero-coding in spoken Japanese. Minashima (2001) reports that indefinite or inanimate objects are more likely to be zero-coded than definite or animate objects. The results in Fry (2001: 128ff.) support Minashima's generalization.<sup>21</sup> Kurumada & Jaeger (2013; 2015), by conducting experiments on speakers' choice between overt vs. zero particles, also report that speakers are more likely to attach the overt particle (*o*) to animate objects. On the other hand, Fry (2001: 128ff.) reports that "strongly definite" subjects (proper nouns and personal pronouns) are more likely to be zero-coded than other kinds of subjects. Also, animate subjects are more likely to be zero-coded than inanimate subjects. Fry points out that this tendency follows the typological generalization proposed in Comrie (1979; 1983).

Niwa (2006) suggests that the predictability of nouns influences the coding of particles. Compare (71-a) and (71-b), for example. The only difference between these two examples is what might fall from the sky; in (71-a), rain might fall, while, in (71-b), hail might fall, which is more surprising. In (71-a), both the overt particle *ga* and the zero particle are acceptable. By contrast, in (71-b) only the overt particle is acceptable.

<sup>21</sup>In Fry's data, zero-codings of animate and inanimate objects are not significantly different. He speculates that this might be because of the small number of animate objects in his corpus.


(71) (The sky looks threatening.)


Kurumada & Jaeger (2013)<sup>22</sup> argue that

Japanese speakers prefer to produce an object NP without case marking when the grammatical function of a noun is made more predictable given the semantics of the noun (e.g., animacy) and the other linguistic elements in the sentence (e.g., plausibility of [grammatical-function]-assignment given the subject, object, and verb)

For example, doctors are more likely to do something to patients, rather than vice versa. Therefore, case in (72-a) is more predictable than in (72-b), meaning that *isya* in (72-b) is more likely to be overtly coded than *kanzya* in (72-a).


patient-nom doctor-{*o/Ø*} hospital.room-in wait-past 'The/a patient waited for the/a doctor in a hospital room.' (Translated from Kurumada & Jaeger (2013: 860))

They argue that their study "constitutes strong support for the view that language production is optimized to maximize the efficiency of information transmission", referring to Levy & Jaeger (2007) and Jaeger (2010).

**2.4.2.7.7 Other pragmatic factors** Makino & Tsutsui (1986) and Backhouse (1993) point out that NPs in interrogatives tend to be zero-coded. This is supported by Fry (2001), who studied a large corpus. For example, in (73) from the corpus of Fry (2001), *pen*, whose existence is in question, is zero-coded.

(73) nanka kami-to   pen-**Ø** aru?
     um    paper-and pen-*Ø*   exist
     'Um, do you have pen and paper?' (Fry 2001: 120)

<sup>22</sup>See also Kurumada & Jaeger (2015: 863).


Sentences of this type have attracted particular attention because the zero particle in this sentence is not optional; *wa* and *ga* (and, of course, *o*) cannot be used in this context. According to Onoe (1987), these obligatory zero particles typically appear in sentences like the following:


Also, Tsutsui (1984: 118ff.) observes that zero particles code information the hearer expects to hear. As shown in the contrast between (75) and (76), the zero particle (as well as *ga* in this case) can naturally code *basu* 'bus' in (75) if the speaker and the hearer are waiting for a bus and hence the hearer expects to hear the word *basu* 'bus'; on the other hand, zero-coded *basu* in (76) is unnatural because the hearer does not expect to hear *basu*.


Some researchers argue that discourse structure affects the selection of *wa* vs. Ø. Analyzing casual interviews, Suzuki (1995) claims that "relatively speaking, zero-marked phrases tend to represent minor [discourse] boundaries in contrast to major boundaries represented by *wa*-phrases" (p. 615). On the other hand, Kurosaki (2003), investigating scenarios of TV dramas, argues that zero particles are employed to introduce new topics (see also Niwa 2006), which implies that they appear at major discourse boundaries. For now, I suppose that it is extremely difficult to identify discourse boundaries in a reliable way, let alone the difference between major and minor boundaries. Therefore, we need to wait for breakthroughs in this area.

**2.4.2.7.8 Remaining issues** As we can see from the outline of studies on zero particles, the factors that affect zero vs. overt coding are complex, and some results are contradictory. A theory that explains zero-coding is necessary. I propose a unified theory that predicts zero-coding in terms of information structure based on Nakagawa (2013). Along the lines of Comrie (1979; 1983), I propose a frequency account of zero vs. overt coding of particles. I believe that this account is congruent with the theory proposed in Levy & Jaeger (2007) and Kurumada & Jaeger (2013; 2015).

### **2.4.3 Word order**

While the basic word order in Japanese is APV (or SOV in more popular terminology), other variants are also possible. Example (77-a) shows the basic word order, and examples (77-b–f) show other possibilities. According to Shibatani (1990: 260), not all possibilities are equally natural in out-of-the-blue contexts, as shown by '?' before the sentence.



In spoken Japanese, NPs (and adverbs) sometimes appear post-predicatively as exemplified in (78-b).


Different theories are interested in different aspects of word order phenomena in Japanese. As far as I can see, generative linguists and psycholinguists are mainly interested in 'scrambling': word order variations of the subject, the object, the dative, and possibly other arguments, all of which appear before the predicate. More recently, generative linguists have also been interested in the 'left periphery', which is tightly connected with information structure. Some construction grammarians study dative-alternation-like phenomena in Japanese.<sup>23</sup> Functional linguists and, more recently, interactional linguists have been interested in post-predicate constructions, partially because they are mainly working on spoken language, and post-predicate constructions in Japanese only appear in spoken language. On the other hand, traditional Japanese linguists have not discussed the word order phenomena that I am interested in (except for Noda 1983). Instead of word order variations, they concentrate on affix ordering and dependency relations (see e.g., Saeki 1998).

I outline previous studies on basic word order and other word order variation in the following sections. Note that different approaches are skewed to different sections for the reasons stated above.

### **2.4.3.1 Basic word order**

As far as I can tell, all Japanese linguists agree that the basic word order in Japanese is SOV (APV in terms of this study). For example, Shibatani (1990) states that "Japanese is an 'ideal' SOV (Subject-Object-Verb) language in the sense that

<sup>23</sup>I do not discuss the dative alternation in this study. See Nakamoto et al. (2006), who found that the choice between DAT+P+V and P+DAT+V is determined by the meaning of a sentence as a whole. More specifically, they showed that P+DAT+V is preferred for caused motion. On the other hand, their results also show that "there is an overall tendency for Japanese speakers to prefer [DAT+P+V] order to [P+DAT+V]" (p. 1). They argue that "the strength of the preference is not constant among different supralexical meanings" (ibid.).


the word order of 'dependent-head' is consistently maintained with regard to all types of constituent" (p. 257).


Chujo (1983) conducted a sentence-comprehension experiment and reports that it takes longer to judge the grammaticality of the PAV order than that of the APV order.<sup>24</sup> It has also been confirmed that the PAV order is more difficult to process than the basic APV order in other experiments such as phrase-by-phrase reading tasks (Miyamoto & Takahashi 2001), eye-movement experiments (Mazuka et al. 2001), and ERP experiments (Ueno & Kluender 2003).

In my data from *the Corpus of Spontaneous Japanese*, which will be explained in the next chapter, 39 examples appear in APV order, whereas 9 examples appear in PAV order. Therefore, APV is the basic (most frequent) word order in the corpus.<sup>25</sup> Note, however, that these numbers are very small compared to examples where a single full NP appears in a clause; 644 examples appear in the SV order, 336 examples appear in the PV order (without A), and 526 examples appear in the DAT + V order.<sup>26</sup> That clauses with two or more full NPs within the same clause are infrequent has already been reported for Japanese (Matsumoto 2003) and for other languages (Du Bois 1987; Dryer 1997), and the observation is also supported in my data.
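The proportions behind these counts can be checked with a few lines of arithmetic. The following is only a sketch: the variable and function names are illustrative, and the figures are the ones reported in the paragraph above (the single-NP dictionary covers only the three patterns named there).

```python
# Counts reported above for clauses containing full NPs.
two_np_counts = {"APV": 39, "PAV": 9}
one_np_counts = {"SV": 644, "PV": 336, "DAT+V": 526}


def shares(counts):
    """Return each pattern's share of the total, as a fraction."""
    total = sum(counts.values())
    return {pattern: n / total for pattern, n in counts.items()}


two_np_shares = shares(two_np_counts)
two_np_total = sum(two_np_counts.values())
one_np_total = sum(one_np_counts.values())

print(f"APV share among clauses with both A and P: {two_np_shares['APV']:.1%}")
print(f"Clauses with A and P: {two_np_total}; single-NP clauses: {one_np_total}")
```

Running the sketch makes the asymmetry concrete: clauses with both A and P are a small minority next to single-NP clauses, and within that minority APV clearly dominates.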

<sup>24</sup>There is one exceptional case: if P is human and is not followed by the particle *o*, the time difference between APV and PAV disappears.

<sup>25</sup>Other non-verb-final orders such as VAP or AVP are extremely rare.

<sup>26</sup>However, the AV pattern appears only in 8 examples.


### **2.4.3.2 Clause-initial elements**

Although clause-initial NPs can also be called "preposed" or "scrambled" NPs, I call them clause-initial because terms like "preposing" and "scrambling" assume movement of the NPs. Some even call all clause-initial NPs "topicalized" NPs, a term that I do not employ either because it already attributes a special function to the NPs in question. On the other hand, the term "clause-initial" does not assume movement or any other special function of clause-initial NPs.

**2.4.3.2.1 Topic** Functional linguists and recent generative grammarians who are working on cartography agree that topic-like NPs appear clause-initially. As has traditionally been pointed out, topics, which correlate with given information, tend to appear clause-initially (Mathesius 1928; Firbas 1964; Daneš 1970; Kuno 1978). These topics function as "anchors" that associate previous and upcoming utterances. Generative grammarians (e.g., Endo 2014) assume the universal hierarchy in (80) proposed by Rizzi (2004) and argue that Japanese also follows this hierarchy. In generative grammar, it is assumed that a language (structure) is uniform unless there is strong counter-evidence for it (the Uniformity Principle: Chomsky 2001: 2).

(80) Force Top\* Int Top\* Focus Mod\* Top\* Fin IP (Rizzi 2004: 242)

"Force" stands for clause types such as declarative, interrogative, and imperative; "Top" for topic, "Int" for higher *wh*-elements (Rizzi 2001), "Mod" for modifiers such as adverbs, and "Fin" for finiteness.

Ferreira & Yoshita (2003) conducted a production experiment and found that Japanese speakers produced given arguments before new arguments, especially "when the previous mention of the given argument involved the same lexical content" (p. 688). Imamura (2017) employed *the Balanced Corpus of Contemporary Written Japanese* (BCCWJ) and concluded that "the direct objects in OSV [noncanonical "scrambled" word order] and *wa*-marked entities are generally given information. Yet, word order changes from SOV [canonical word order] to OSV do not influence the cataphoric prominence of a referent" (p. 78).

**2.4.3.2.2 Weight** Another important factor that affects word order is the weight of the NP. Yamashita & Chang (2001) pointed out that in Japanese heavy NPs tend to precede light NPs, whereas in SVO languages like English light NPs precede heavy NPs (e.g., Arnold et al. 2000). They also report that topics and subjects tend to precede other NPs, and that the weight and topichood of an NP compete to decide the order of the NPs (see also Kondo & Yamashita 2008).


**2.4.3.2.3 Remaining issues** The previous literature agrees that topics, correlating with given information, appear clause-initially. This is also motivated from a cognitive perspective. The results of Chapter 5, however, show that not all given elements appear clause-initially. Moreover, there are post-predicate elements which correspond to topics in Japanese. It is therefore also necessary to explain why some topics appear after the predicate. In Chapter 5, I will show that sharedness, rather than givenness in general, affects word order in Japanese, and that activation status determines whether NPs appear clause-initially or post-predicatively. Also, whether the referent in question is mentioned in the following discourse or not affects word order in addition to the effect of particles, contrary to the finding of Imamura (2017).

### **2.4.3.3 Post-predicate elements**

I call NPs that appear after the predicate "post-predicate" or "postposed" NPs. As has been stated earlier, they appear mainly in the spoken language. While adverbs and noun-modifying phrases are also postposed frequently in conversation, the present study only discusses postposed NPs, which are exemplified in (81).


**2.4.3.3.1 Afterthoughts** Some researchers consider postposed elements to be "afterthoughts" (Shibatani 1990: 259): a clarification for an omitted element. Kuno (1978), Hinds (1982), and Ono & Suzuki (1992) also make a similar point. However, it has been pointed out that some postposed elements are produced in a coherent intonation contour without pause (Ono & Suzuki 1992: 436; Ono 2007: §2), which suggests the possibility that the speaker has no time to plan the postposed part as an afterthought; rather, the postposed part has been planned as such.

**2.4.3.3.2 Non-focus** Takami (1995b), modifying Kuno (1978), proposes that NPs that are postposed are not foci. When focus NPs are postposed, the sentences are not acceptable, as shown in (82), where the *wh*-word *nani* 'what' in (82-a) and *mizu* 'water' in (82-b) are considered to be foci.



Takami (1995a) argues that the NPs in the following examples can be postposed because they are not the most important information, although they are part of the focus.


I suppose that Takami's "important information" is equal to focus. In (83), part of the focus is postposed, but it is not "the most focalized part"; so the sentences in (83) are acceptable. Therefore, Takami's generalization that foci (or the most focalized part) cannot be postposed still holds.

Fujii (1991) argues that pragmatically important parts (such as focus and contrast) are uttered first, which results in postposed constructions. I consider this argument to be similar to Takami's argument and include Fujii in this section of postposed elements as non-focus.

**2.4.3.3.3 Emphasis** Hinds (1982) argues that some postposed elements add emphasis to the utterance. Ono & Suzuki (1992: 437) also highlight postposed elements that "strengthen the speaker's stance toward the proposition."

Although it is not clear how to identify "emphasis", their argument is important at least in two ways. First, when the postposed elements are produced in a coherent contour with the predicate, they are similar to final particles such as *ne* and *yo*. For example, in (84), the postposed element *watasi* 'I' follows the final particle *yo*.

(84) sukii ski itte go ki-masi-ta-**yo** come-plt-past-fp **watasi** 1sg '(I) went skiing, me.' (Ono & Suzuki 1992: 438)


Given that final particles can appear in a row (e.g., *oisii yo ne* 'good, isn't it?'), it is no wonder that postposed elements behave as final particles, adding some kind of speaker attitude toward the proposition.

Second, as Ono & Suzuki (1992) pointed out, the implicatures of some postposed constructions are dramatically different from the corresponding pre-predicate constructions. For example, compare (85-a) and (85-b), which are composed of exactly the same elements and only differ in their word order. In (85-a), *sore* 'that' is postposed; in (85-b), *sore* is in the basic position. Therefore, they are expected to convey exactly the same meaning. However, (85-a) is not a simple question; rather it is closer to a rhetorical question implying that the speaker doesn't like *sore*. On the other hand, (85-b) is a simple neutral question.


Based on the evidence discussed above, Ono (2007) claims that the postposed construction has already been grammaticalized and is part of Japanese grammar.

**2.4.3.3.4 Activation cost** Nakagawa et al. (2008) divided postposed NPs into two types based on intonation, following Ono & Suzuki (1992): postposed elements uttered in the same intonation contour as the predicate (single-contour type) and the ones uttered separately from the predicate (double-contour type). They measured the Referential Distance (RD) between the postposed element in question and its immediate antecedent, counted in inter-pausal units. The RD approximates the activation cost of the referent. A smaller RD indicates that the referent has been mentioned relatively recently and hence the activation cost is low; a larger RD indicates that it has been mentioned less recently and hence the activation cost is high.

Nakagawa et al. found that the RD of the single-contour type is much smaller than that of the double-contour type. They argue that the activation cost of the single-contour type is small and the referent is discussed currently as a topic. On the other hand, they report that the double-contour type is affected by multiple factors.
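To make the RD measure concrete, here is a minimal sketch. It assumes that RD is simply the number of inter-pausal units one has to go back from the postposed element to reach the most recent earlier mention of the same referent; the exact counting procedure of Nakagawa et al. (2008) is not given above, and the data representation, the function name, and the toy discourse are all invented for illustration.

```python
from typing import List, Optional, Set


def referential_distance(ipus: List[Set[str]], index: int, referent: str) -> Optional[int]:
    """Count inter-pausal units back to the most recent earlier mention.

    `ipus` is a hypothetical representation: one set of referent labels per
    inter-pausal unit, in discourse order. `index` is the unit containing the
    postposed element. Returns None when there is no earlier mention.
    """
    for back in range(1, index + 1):
        if referent in ipus[index - back]:
            return back
    return None


# Toy discourse (invented labels, not corpus data).
discourse = [
    {"trip"},      # unit 0
    {"speaker"},   # unit 1
    {"weather"},   # unit 2
    {"speaker"},   # unit 3
    {"speaker"},   # unit 4: postposed mention, antecedent in unit 3
    {"trip"},      # unit 5: postposed mention, antecedent in unit 0
]

rd_recent = referential_distance(discourse, 4, "speaker")
rd_distant = referential_distance(discourse, 5, "trip")
rd_none = referential_distance(discourse, 0, "trip")
```

Under these assumptions, a referent mentioned in the immediately preceding unit gets a small RD (low activation cost), while one last mentioned several units earlier gets a larger RD (high activation cost).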


**2.4.3.3.5 Preferred interactional structure** Tanaka (2005) argues that interactional factors affect word order in Japanese conversation. In sequences of conversation, there are preferred and dispreferred organizations (Schegloff et al. 1977; Heritage 1984; Pomerantz 1984). Preferred organizations are, for example, an assessment followed by agreement and a request followed by acceptance. On the other hand, dispreferred organizations include an assessment followed by disagreement and a request followed by refusal. Preferred second parts – such as agreement following an assessment or acceptance following a request – are simple, direct, and are uttered without delay. On the other hand, dispreferred second parts – such as disagreement following an assessment and refusal following a request – are complex, indirect, and are uttered with delay. Levinson (1983: 332ff.) compares preferred vs. dispreferred organizations to unmarkedness vs. markedness in morphology.

Based on this, Tanaka (2005) found that preferred second parts begin with the predicate, followed by NPs and other adverbs and adverbial clauses, while dispreferred second parts end with the predicate, preceded by NPs and other elements. Tanaka argues that this contrast is observed because it is the predicate that expresses the conclusion, i.e. the agreement, disagreement, acceptance, or refusal.

Let us take a closer look at an example of an assessment-agreement sequence. In (86), Chikako (C), Keiko (K), and Emiko (E) are talking about current fashion trends which have been revived from their youth. First, Chikako comments that current fashion is exactly the same as the fashion trends of their youth. Then Keiko immediately agrees with Chikako by uttering the predicate followed by an NP. Note that the sign "=" indicates that there is no pause between utterances.


On the other hand, in the next example – a dispreferred second part – the speaker delays the predicate expressing refusal by putting several NPs and adverbs before


the predicate. In the context preceding the second part in (87),<sup>27</sup> the speaker was asked about the content of an advertisement in a magazine.


The speaker could have simply said "we have no knowledge of (it)" because all other NPs are clear from the context. However, the speaker chose to utter the NPs (and adverbs) instead of omitting them presumably to delay the conclusion.

**2.4.3.3.6 Remaining issues** Postposed constructions have been well studied in various theories. However, few studies examine the difference between postposed NPs and other NPs such as clause-initial and pre-predicate NPs. Tanaka (2005) does not explain why speakers sometimes produce post-predicate elements and sometimes not. In Chapter 5, I will investigate these three kinds of NP in terms of information structure, especially activation cost. Also, I will discuss the possible *raison d'être* of post-predicate elements.

### **2.4.3.4 Pre-predicate elements**

I call NPs that appear immediately before the predicate pre-predicate elements. The previous discussion of basic word order in Japanese implied that Ps most frequently appear pre-predicatively and that this is the basic order. Following almost all theories, I assume that Ps appear pre-predicatively in the basic

<sup>27</sup>I modified the transcription symbol "- (hyphen)" to "~ (tilde)" because hyphens are used to express morphological boundaries in this study. The tilde (originally, a hyphen) indicates that an utterance (typically a word) is cut off abruptly partway through. I will not explain other transcription symbols here because they are irrelevant to the current discussion. For more detail on transcription symbols, see Jefferson (2004) and Hepburn & Bolden (2013).


word order and I provide a review of other characteristics of NPs that appear pre-predicatively.

**2.4.3.4.1 Unaccusativity** Since Perlmutter (1978), it is widely assumed that there are two types of intransitive verbs: unergative verbs, which involve an agent, and unaccusative verbs, which involve only a patient (theme). Especially among generative linguists, it is also assumed that the argument of an unergative verb syntactically appears in the same position as the subject (A) of transitive clauses, while the argument of an unaccusative verb appears in the same position as the object (P) of transitive clauses. Kageyama (1993), who applied this idea to Japanese, provides rich examples to support this analysis of the surface structures of Japanese sentences. As can be seen in examples (88) to (90), *otoko-no ko* 'boy' – which is the argument of an unergative verb in (89) – appears in the same position as *kodomo* 'child' in (88) – which is the subject (A) of a transitive verb. On the other hand, *ki-no eda* 'tree branch' in (90), which is the argument of an unaccusative verb, appears in the same position as *ki-no eda* in (88), the object (P) of a transitive verb.

(88) **Transitive verb**

a. kodomo-ga child-nom **ki-no** tree-gen **eda**-o branch-acc ot-ta break-past 'A child broke a tree branch.'

(Kageyama 1993: 46)

(89) **Intransitive (Unergative) verb**

a. otoko-no male-gen **ko**-ga child-nom abare-ta go.violent-past 'A boy went violent.'


The important point for our purposes is that the arguments of unaccusative verbs and the objects (P) of transitive verbs are structurally closer to the verb; i.e., they appear pre-predicatively in Japanese, which is basically a verb-final language.

**2.4.3.4.2 Focus** Kuno (1978) and Takami (1995a) point out that pre-predicate elements are foci ("most important information"). Endo (2014: §4.2) also states that foci appear pre-predicatively. Compare the following examples. In (91), where 'Boston' appears pre-predicatively and is preceded by 'Hanako', responding only to Boston is felicitous (91-A), while responding only to Hanako is not (91-A′).

	- A ′ : \*un yeah **hanako-to** Hanako-with it-ta-yo go-past-fp 'Yeah, I went with Hanako.' (Kuno 1978: 52)


In (92), on the other hand, where 'Hanako' is preceded by 'Boston', responding only to Hanako is a natural answer, as illustrated in (92-A′), while responding only to Boston is not, as shown in (92-A).

	- A: \*un yeah **bosuton-ni** Boston-dat it-ta-yo go-past-fp 'Yeah, I went to Boston.'
	- A ′ : un yeah **hanako-to** Hanako-with it-ta-yo go-past-fp 'Yeah, I went with Hanako.' (Kuno 1978: 54)

This implies that focus appears pre-predicatively. The results reported in Chapter 5 basically support this observation.

**2.4.3.4.3 Remaining issues** The observations discussed in the literature above imply that Ps, the arguments of unaccusative verbs, and foci appear pre-predicatively. The results in Chapter 5 show that both patienthood and newness contribute to word order in Japanese. The next question is what kind of theory allows both patients and new elements to appear pre-predicatively. Throughout this study, I aim at showing the plausibility of a theory that captures multiple variables at the same time, i.e., the theory of competing motivations (Du Bois 1985).

### **2.4.4 Intonation**

I use the terms *intonation* and *prosody* roughly interchangeably. Here I outline studies on the associations between intonation and functions such as information structure. For detailed phonetic descriptions and analyses of Japanese intonation, see Beckman & Pierrehumbert (1986); Pierrehumbert & Beckman (1988); Sugito (1994b); Venditti (2000); Igarashi et al. (2006); Igarashi (2015). Also, I only discuss units smaller than the clause; I do not discuss discourse structure although there are many interesting interactions between intonation and discourse structure in Japanese (e.g., Nakajima & Allen 1993; Venditti & Swerts 1996; Murai & Yamashita 1999; Koiso et al. 2003; Okubo et al. 2003; Koiso & Ishimoto 2012). I focus on studies on intonation units and information structure.


### **2.4.4.1 Definition of intonation unit**

Before reviewing the previous literature, I briefly discuss how an intonation unit is defined. The definition of intonation unit makes use of a labeling system for Japanese prosodic information called X-JToBI, which has already been annotated in *the Corpus of Spontaneous Japanese*. I discuss X-JToBI in the following paragraph, and introduce intonation units afterwards.

**2.4.4.1.1 X-JToBI and intonational phrases** X-JToBI (Maekawa et al. 2002; Igarashi et al. 2006) is based on J-ToBI, proposed in Venditti (1997; 2000) – which is itself modified from ToBI (Tones and Break Indices), a labeling system for English prosody (Silverman et al. 1992; Pitrelli et al. 1994; Beckman & Elam 1997).

Here I mainly discuss the break indices (BI) tier of X-JToBI since this is the most relevant feature for intonation units. The BI labels are determined by human annotators and represent the strength of the prosodic boundaries (Maekawa et al. 2002; Igarashi et al. 2006). BI labels basically consist of 1, 2, and 3.<sup>28</sup> Label 1 corresponds to a word boundary, 2 corresponds to an accentual-phrase boundary, and 3 corresponds to an intonational-phrase boundary. An intonational phrase consists of one or more accentual phrases. An accentual phrase consists of a pitch contour with a single F<sub>0</sub> peak. Intonational-phrase boundaries are the places where a pitch reset occurs; if the pitch range of the current accentual phrase is smaller than that of the next accentual phrase, an intonational-phrase boundary is identified at the current accentual-phrase boundary.

Below is an example of an intonational-phrase boundary (label 3), the boundary type most relevant to our study. Figure 2.1 shows the pitch contour of the utterance in (93).

(93) aoi blue yane-no roof-gen ie-ga house-nom mieru visible 'A house with the blue roof is visible.'

The vertical lines in the figure across the pitch contour indicate the peak and the bottom of F<sub>0</sub>. A contour with a single pitch peak corresponds to a single accentual phrase. Comparing the first (*aoi* 'blue') and the second (*yane-no* 'roof-gen') accentual phrases, the pitch range of the second is smaller than that of the first; i.e., downstepping occurs in the second accentual phrase. Downstepping, a.k.a. catathesis, is "a phonological process by which the [pitch] range is compressed after a lexical accent" (Venditti (2000: 17), see Poser (1984); Beckman &

<sup>28</sup>In addition, there are diacritics: m, -, p. There are also labels for disfluency: word fragments, fillers, and so on. See Igarashi et al. (2006) for a detailed description.


Pierrehumbert (1986); Pierrehumbert & Beckman (1988); Kubozono (1993)). In Figure 2.1, the first accentual-phrase boundary is not an intonational-phrase boundary. On the other hand, comparing the second (*yane-no* 'roof-gen') and the third (*ie-ga* 'house-nom') accentual phrases, the second pitch range is smaller than the third one. Therefore, the second accentual-phrase boundary is an intonational-phrase boundary.
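The pitch-reset criterion described above can be sketched as a simple comparison of adjacent pitch ranges. This is a deliberately simplified illustration, not the X-JToBI procedure itself: actual BI labels are assigned by human annotators, and the function name and the pitch-range values below are invented to mimic the pattern described for (93).

```python
from typing import List


def label_boundaries(pitch_ranges: List[float]) -> List[int]:
    """Label each boundary between adjacent accentual phrases as 2 or 3.

    Simplified reading of the criterion in the text: a boundary counts as an
    intonational-phrase boundary (3) when the pitch range of the current
    accentual phrase is smaller than that of the next one (a pitch reset);
    otherwise it stays an accentual-phrase boundary (2).
    """
    labels = []
    for current, following in zip(pitch_ranges, pitch_ranges[1:]):
        labels.append(3 if current < following else 2)
    return labels


# Hypothetical pitch ranges (in Hz) for aoi / yane-no / ie-ga.
# The values are invented; only their relative sizes follow the text.
example_ranges = [90.0, 50.0, 80.0]
example_labels = label_boundaries(example_ranges)
```

With these made-up values, the boundary after *aoi* is labelled 2 (the range shrinks, i.e. downstepping) and the boundary after *yane-no* is labelled 3 (the range expands again, i.e. a pitch reset).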

Figure 2.1: An example of annotation of BI (Igarashi et al. 2006: 412). In the source, the figure is labelled 図 7.61 and illustrates BI=3 in a sequence of accented accentual phrases.

**2.4.4.1.2 Intonation unit** Based on X-JToBI, Den et al. (2010) and Den et al. (2011) propose the definition of intonation unit which I will employ in this study. They call it short utterance-unit as opposed to long utterance-unit, but I use the term "intonation unit (IU)" throughout since I do not discuss long utterance-units. An intonation-unit boundary is identified where there is an intonational-phrase boundary (the boundary labelled as 3 in CSJ) discussed above, a clause boundary,<sup>29</sup> or a pause equal to or more than 0.1 seconds. As discussed in Enomoto et al. (2004), it is difficult for human annotators to agree when deciding on intonation-unit boundaries based on the system proposed in Du Bois et al. (1992) and Iwasaki (2008). Den and his colleagues made it possible to identify intonation units in spontaneous speech consistently across annotators.
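The three boundary criteria can be sketched as a small segmentation routine. This is an illustration under stated assumptions rather than the procedure of Den et al.: the `Juncture` record, its field names, and the placeholder words are hypothetical, and the clause-boundary flag is taken as given rather than computed.

```python
from dataclasses import dataclass
from typing import List

PAUSE_THRESHOLD_SEC = 0.1  # threshold stated in the text


@dataclass
class Juncture:
    """Hypothetical record describing the juncture that follows a word."""
    break_index: int       # X-JToBI BI label: 1, 2, or 3
    clause_boundary: bool  # True if a clause ends here (assumed given)
    pause_sec: float       # duration of the following pause; 0.0 if none


def is_iu_boundary(j: Juncture) -> bool:
    """An IU boundary is a BI=3 boundary, a clause boundary, or a long pause."""
    return (
        j.break_index == 3
        or j.clause_boundary
        or j.pause_sec >= PAUSE_THRESHOLD_SEC
    )


def segment(words: List[str], junctures: List[Juncture]) -> List[List[str]]:
    """Group words into intonation units; junctures[i] follows words[i]."""
    units: List[List[str]] = []
    current: List[str] = []
    for word, juncture in zip(words, junctures):
        current.append(word)
        if is_iu_boundary(juncture):
            units.append(current)
            current = []
    if current:  # trailing words without a closing boundary
        units.append(current)
    return units


# Placeholder words and invented juncture values.
words = ["w1", "w2", "w3", "w4", "w5"]
junctures = [
    Juncture(2, False, 0.0),   # accentual-phrase boundary only
    Juncture(3, False, 0.0),   # intonational-phrase boundary
    Juncture(1, False, 0.25),  # pause of 0.25 s
    Juncture(2, False, 0.05),  # pause too short to count
    Juncture(2, True, 0.0),    # clause boundary
]
units = segment(words, junctures)
```

Because each criterion is checked mechanically, two annotators working from the same labels and pause measurements obtain the same segmentation, which is the consistency benefit noted above.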

In the following section, however, I review studies on various kinds of intonation units including those defined in Du Bois et al. (1992); Maekawa et al. (2002); Iwasaki (2008); Den et al. (2011). Also, whereas prominence marking, downstepping, and boundary pitch movements are more popular topics than intonation units, I review those studies in relation to the current study. See Venditti et al. (2008) for an overview of such studies.


<sup>29</sup>To be more precise, this is a long utterance-unit boundary. See Den et al. (2011) for the definition of this unit.


### **2.4.4.2 Intonation units and related phenomena**

In this section, I present a review of the literature on the association between prosodic units and related characteristics of language. Note again that the review includes various kinds of prosodic units based on slightly different definitions, although they agree in many cases.

**2.4.4.2.1 Prominence and downstepping** Prominence and downstepping are crucial features in determining intonation units. It is well known that a focus receives prominence (pitch peak). Pierrehumbert & Beckman (1988: 99–101) report that "sequences with focus on the noun almost always had an intermediate phrase [i.e., intonational phrase] boundary between the adjective and the noun[...] an intermediate phrase boundary blocks catathesis [i.e., downstepping]". The conclusion was reached through production experiments where subjects were asked to produce a sequence of an adjective and a noun with different focus positions. The target sentences and contexts used by Pierrehumbert and Beckman are like the ones in (94). The capital letters indicate that those words are in focus, and the bold-faced letters indicate that they are the target of analysis.

(94) Q: [In America,] are there sweet beans or carrots like there are in Japan? A: amai sweet NINZIN-wa carrot-top ari-masu-ga exist-plt-though **amai** sweet **MAME**-wa bean-top ari-mase-n exist-plt-neg 'There are sweet CARROTS, but there aren't sweet BEANS.' (Pierrehumbert & Beckman 1988: 59)

Pierrehumbert and Beckman showed that there is an intonational phrase (i.e., intermediate phrase) boundary between the adjective (*amai* 'sweet' in (94-A)) and the noun (*mame* 'bean' in (94-A)) when the noun is a focus, as in (94). Although the results are complicated, they conclude that their generalization applies to both accented and unaccented words.<sup>30</sup>

<sup>30</sup>Kubozono (2007) compared two definitions of downstepping (syntagmatic and paradigmatic) and investigated whether a pitch reset occurs before the focus. He found conflicting results: from a syntagmatic perspective, the focus receives higher pitch than the preceding phrase, which indicates that downstepping is blocked. From a paradigmatic perspective, on the other hand, he had to conclude that downstepping is not blocked before the focus. The present study employs the definition of syntagmatic downstepping and assumes that the conclusions in Pierrehumbert & Beckman (1988) and Kubozono (2007) do not contradict each other. See Kubozono (2007) for detailed discussion on this issue.


**2.4.4.2.2 Focus projection** There has been a cross-linguistic question of how human beings distinguish broad focus and narrow focus: the issue of focus projection. This has been investigated for English, German and Dutch (Selkirk 1984; Gussenhoven 1983). Ito (2002), who investigated this question in Japanese, compared the response time and acceptability of each of the intonation types in (95-A1–A3) as answers to a broad focus question like (95-Q). The capital letters indicate the phrases whose pitch range is expanded.

	- A1: kare-wa 3sg-top **DAIBINGU-o** diving-acc **HAZIMERU-n-da-yo** begin-nmlz-cop-fp 'He starts (scuba) diving.'
	- A2: kare-wa 3sg-top **DAIBINGU-o** diving-acc **hazimeru-n-da-yo** begin-nmlz-cop-fp 'He starts (scuba) diving.'
	- A3: kare-wa 3sg-top **daibingu-o** diving-acc **HAZIMERU-n-da-yo** begin-nmlz-cop-fp 'He starts (scuba) diving.' (Ito 2002: 412)

Ito found that "though dual prominence [like (95-A1)] is preferred for answers to broad focus questions, utterances with a single intonational prominence on the object [like (95-A2)] may be comprehended equally quickly as those with dual prominence" (op.cit.: 413) – where A1 is significantly more acceptable than A2. Also, she reports that the response time and acceptability of the A3-type do not significantly differ from those of A1 and A2. She concluded that "it is possible that the relation between argument structure and intonational focus marking is not universal" (ibid.).

Kori (2011) investigated the intonation of broad and narrow focus and reports that, by default, only the first word receives a pitch peak, whereas the following word is suppressed – although some speakers put prominence on the second word too. (96-a) is the target sentence that he asked participants to read aloud and (96-b-c) are the contexts. In (96-b-c), both *aoi* 'blue' and *mahuraa* 'scarf' are focused, because both of them contrast with 'red' and 'gloves' or 'sweater', respectively. In (96-d), *aoi* 'blue' is narrowly focused because it is the only element that contrasts with 'red', while 'scarf' is not contrasted.

(96) a. **aoi** blue **mahuraa**-dat-ta-n-desu scarf-cop-past-nmlz-cop.plt '(It) was a blue scarf.'



Kori concludes that the default intonation for broad focus is to suppress the second word (*mahuraa* 'scarf' in this case) because most of the participants produced the sentences as such, although some participants chose the sentence with prominence both on *aoi* 'blue' and *mahuraa* 'scarf' when they were asked to choose a good sentence.

**2.4.4.2.3 Functional and cognitive motivations for intonation units** Iwasaki (1993), applying the style of IU identification proposed in Du Bois et al. (1992) and Chafe (1994) to Japanese, argues that a Japanese intonation unit corresponds to a phrase rather than a clause, in contrast to the English IU, which corresponds to a clause according to Chafe (1987; 1994). According to Iwasaki's survey, 42.2% of IUs in Japanese are clausal, whereas 57.8% are phrasal. Their intonation unit is a "stretch of speech uttered under a single coherent intonation contour" (Du Bois et al. 1992: 17). Iwasaki (1993: 39) states that the beginning of an IU "is often, though not always, marked by a pause, hesitation noises, and/or resetting of the baseline pitch level", whereas the ending of an IU "is often, again though not always, marked by a lengthening of the last syllable." Iwasaki (1993) provides (97) to exemplify how intonation units in Japanese correspond to phrases. Each line in (97) corresponds to a single intonation unit, and (97-a–e) as a whole constitutes a single proposition, "I heard that broadcast at home with my family."

	- b. uti-de home-loc kii-ta-no-ne? hear-past-nmlz-fp 'heard at home, you know...'
	- c. sono that are-wa-ne? that-top-fp 'that thing, you know...'
	- d. hoosoo-wa-ne? broadcast-top-fp 'that broadcast, you know,'
	- e. kazoku-de. family-with 'with my family.' (Iwasaki 1993: 40)


The pitch and intensity of (97) are shown in Figure 2.2 from Iwasaki (2008: 109), in which the same example and figure are explained. The IU (97-a) ends with final vowel lengthening, whereas boundary pitch movements are observed at the end of IUs (97-b-d), which are indicated by "?". (97-e) ends with a final lowering, indicated by ".".

Iwasaki distinguishes between four types of "functional components":

### (98) **Four functional components**


Based on this, he shows similarities among different IUs. For example, (99-a) is an IU which only contains an NP followed by particles, whereas (99-b) is an IU which only contains a VP, also followed by particles. The structure of these two IUs is essentially the same in terms of functional components, although they are different in terms of grammatical structure.

(99) a. [mami-ni-dake] Mami-dat-only **ID** [-wa] -top **CO** [-ne] -fp **IT** b. [ik-ase-ta-rasii] go-caus-rep **ID** [-no] -nmlz **CO** [-yo] -fp **IT** '(I heard that she) let only Mami go.'

Iwasaki analyzed his data based on his classification and found that more than 80% of the IUs consist of two or fewer functional components. He states that "this might be due to the limitation of work that the speaker can handle within one IU. [...] Japanese speakers [...] are faced with a constraint which permits them to exercise up to two functions per intonation unit" (p. 49).
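The kind of count behind such a figure can be sketched as follows. This is only an illustration, assuming that each IU has already been annotated as a list of functional-component labels. The labels ID, CO, and IT are taken from (99); the remaining IUs, the function name, and the resulting proportion are hypothetical and do not reproduce Iwasaki's data or procedure.

```python
from typing import List


def share_with_at_most(ius: List[List[str]], limit: int = 2) -> float:
    """Proportion of IUs that contain at most `limit` functional components."""
    if not ius:
        raise ValueError("at least one intonation unit is required")
    short = sum(1 for components in ius if len(components) <= limit)
    return short / len(ius)


# Toy annotation: the first two IUs mirror (99-a) and (99-b);
# the other three are invented for illustration only.
toy_ius = [
    ["ID", "CO", "IT"],  # (99-a) mami-ni-dake -wa -ne
    ["ID", "CO", "IT"],  # (99-b) ik-ase-ta-rasii -no -yo
    ["ID"],              # hypothetical
    ["ID", "IT"],        # hypothetical
    ["ID", "CO"],        # hypothetical
]

print(share_with_at_most(toy_ius))
```

Note that both IUs in (99) contain three components, so under this count they would belong to the minority of IUs that exceed the two-component limit.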

By contrast, Matsumoto (2000: 68) reports that "one clause comprises an average of 1.2 IUs" and argues that "the clause is the syntactic exponent of Japanese substantive IU". She proposes the "one new NP per IU" constraint in Japanese, comparing it to the one new idea at a time constraint in Chafe (1987; 1994). However, Matsumoto (2003: §5.6) also reports that one new or given NP per IU is preferred in Japanese conversation. Therefore, new as well as given NPs appear in an intonation unit without other NPs.

Figure 2.2: Example of an intonation unit (Iwasaki 2008: 109)

Nakagawa et al. (2010) focused on the difference between phrasal IUs and clausal IUs and analyzed them in terms of information structure. They measured referential distance and persistence (Givón 1983) and concluded that one of the functions of phrasal IUs is to introduce or re-introduce important topics in discourse. They compare this function of phrasal IUs to left-dislocations observed in many languages.
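The two measures can be illustrated with a minimal sketch. It assumes one common operationalization: referential distance is the number of clauses back to the most recent previous mention of a referent (with a cap when there is none), and persistence is the number of immediately following clauses in which the referent continues to be mentioned. The cap of 20, the function names, and the toy discourse are assumptions made for illustration; they are not taken from Nakagawa et al. (2010).

```python
from typing import List, Set


def referential_distance(clauses: List[Set[str]], i: int, ref: str, cap: int = 20) -> int:
    """Number of clauses back from clause i to the latest earlier mention of ref.

    Returns `cap` when ref has no earlier mention (assumed convention).
    """
    for back in range(1, i + 1):
        if ref in clauses[i - back]:
            return min(back, cap)
    return cap


def persistence(clauses: List[Set[str]], i: int, ref: str) -> int:
    """Number of consecutive clauses after clause i that still mention ref."""
    count = 0
    for later in clauses[i + 1:]:
        if ref not in later:
            break
        count += 1
    return count


# Hypothetical discourse: each set lists the referents mentioned in one clause.
toy_clauses = [{"mouse"}, {"cat"}, {"cat", "mouse"}, {"mouse"}, {"mouse"}, {"dog"}]

print(referential_distance(toy_clauses, 2, "mouse"))  # last mention is two clauses back
print(persistence(toy_clauses, 2, "mouse"))           # mentioned in the next two clauses
```

Under this operationalization, a referent with a high referential distance and a high persistence is one that is (re-)introduced and then talked about for a while, which is the profile associated with important topics.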

**2.4.4.2.4 Remaining issues** Most studies on phonetics and phonology concentrate on foci rather than topics. Among different focus types, most of the studies (except for those on focus projection) concentrate on narrow focus rather than broad focus. Moreover, almost all of them are experimental studies rather than corpus studies. By contrast, I focus here on the differences between broad foci and topics in spontaneous speech, although I also carry out a production experiment.

Previous functional studies such as Iwasaki (1993), Matsumoto (2000; 2003), and Nakagawa et al. (2010) have methodological issues since they rely on an impressionistic definition of intonation units. This study, by contrast, is based on a strict definition of the intonation unit and aims at revealing associations between intonation and information structure.

The results in Chapter 6 show that an intonation unit corresponds to a unit of information structure – e.g., topic or focus – which frequently but not always overlaps with a unit of the syntactic structure.


### **2.4.4.3 Pause**

Sugito (1994a) showed in a perceptual experiment that pauses appear before pitch reset. She recorded trained announcers reading the news and had subjects listen to the recording. She found that, when the pauses were eliminated and only the pitch resets remained, subjects perceived the voice as though two people were overlapping with each other. According to her, it is in fact impossible to reset pitch without pauses, and vocal cords are tensed 0.1 seconds before speech production. Based on this, I assume that pauses correlate with pitch reset.

### **2.5 Summary**

In this chapter, I outlined the previous literature on topics and foci as well as the characteristics of Japanese relevant to this study, and enumerated the remaining questions to be investigated.

In Chapters 4 to 6, I investigate the associations between information structure and particles, word order, and intonation in spoken Japanese. Before this, in the next chapter, I introduce the framework adopted in this study.

# **3 Framework**

### **3.1 Introduction**

In this chapter I describe the framework adopted in the present study. First, in §3.2, I introduce the theory of conceptual space assumed throughout. Then, I define the concepts of 'topic' and 'focus' I adopt, and describe the features that have been proposed to be associated with information structure phenomena (§3.3). Finally, §3.4 explains the characteristics of the corpus to be investigated and how to annotate features correlating with topic and focus.

To investigate the cognitive motivations of some linguistic category (e.g., topic and focus), it is possible to use a variety of clues, such as generalizations about typological tendencies, models of language processing, theories of language change and language contact, language acquisition processes, and language production data, as well as traditional grammaticality and acceptability judgements of sentences. This study mainly employs language production data (a.k.a. corpora) and sentence acceptability, two methods that directly reflect the intuition and cognition of adult native speakers of Japanese. Sometimes I also use production experiments to obtain enough data under controlled contexts. It is necessary to investigate other kinds of clues, such as typological tendencies, language processing models, and language acquisition processes of many other languages to reveal how cognition is reflected in human language in general. I hope that this study contributes to this larger goal.

The study restricts itself to standard Japanese. One reason for choosing this language is that there are few empirical studies on information structure in spoken Japanese, while there are at least preliminary empirical studies on other languages, such as some European and African languages (e.g., Cowles 2003; Dipper et al. 2004; 2007; Ritz et al. 2008; Skopeteas et al. 2006; Cook & Bildhauer 2013; Chiarcos et al. 2011). Another reason for my language choice is that a large spoken corpus of standard spoken Japanese is available. The corpus is called *the Corpus of Spontaneous Japanese* (CSJ) and is morphologically analysed and annotated with a variety of information such as accentual phrases, intonation, parts of speech, and dependency structures, in addition to basic transcriptions of speech (Maekawa 2003; Maekawa et al. 2004). I will describe characteristics of the corpus in §3.4.3.

### **3.2 Conceptual space and semantic maps**

Throughout this study, I assume a theory of conceptual space (Croft 2001; Haspelmath 2003). A conceptual space is a multi-dimensional model of concepts sensitive to some linguistic function(s). As Croft (2001: 93) states, "conceptual space is a structured representation of functional structures and their relationships to each other. [...] Conceptual space is also multidimensional, that is, there are many different semantic, pragmatic, and discourse-functional dimensions that define any region of conceptual space". The representation is claimed to be universal. An example of conceptual space is shown in Figure 3.1. This is a conceptual space of parts of speech. The horizontal dimension given in capital letters indicates "the constructions used for the propositional acts of reference, modification, and predication" (Croft 2001: p. 93). The vertical dimension indicates the semantic classes of "the words that fill the relevant roles in the propositional act constructions" (op.cit.: 94).

Whereas "the conceptual space is the underlying conceptual structure, [...] a semantic map is a map of language-specific categories on the conceptual space" (p. 94). While conceptual space is supposed to be universal, semantic maps are language-specific. Figure 3.2 is an example of a semantic map of parts of speech specific to Japanese. The dimensions are suppressed for convenience. The figure shows that nouns such as *hon* 'book' accompany *no* to modify another noun and *da* for predication. Adjectives such as *yasu* 'cheap' accompany *i* for both modification and predication. Some nominal adjectives between 'book' and 'cheap' such as *heewa* 'peace(ful)' and *kenkoo* 'health(y)' accompany both *no* and *na* for modification and *da* for predication. They are different from but similar to nouns such as 'book'. Some nominal adjectives such as *atataka* 'warm' and *tiisa* 'small' accompany both *na* and *i* for modification, and 'warm' allows both *da* and *i* to follow in predication. This indicates that they are similar to adjectives rather than nouns. The nominal adjective *kirei* 'pretty' is in between; it only allows *na* for modification and *da* for predication.

"The hypothesis of typological theory, including Radical Construction Grammar, is that most grammatical domains will yield universals of the form-function mapping that can be represented as a coherent conceptual space" (p. 96), which is explicitly stated in (1).





Figure 3.2: The semantic map for the Japanese Nominal, Nominal Adjective, and Adjective constructions (Croft 2001: 95)

(1) **Semantic Map Connectivity Hypothesis**: any relevant language-specific and construction-specific category should map onto a **connected region** in conceptual space. (ibid.)

Japanese parts of speech in Figure 3.2 support this hypothesis. For example, morphemes such as *no* and *na* map onto distinct yet connected regions on the conceptual space. If the adjective suffix *i* could also attach to *hon* 'book', but not to *kirei* 'pretty', for example, this would be a counter-example to the hypothesis.
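The connectivity requirement can be made concrete with a small sketch. It assumes, purely for illustration, that the relevant part of the conceptual space is simplified to a chain running from *hon* 'book' to *yasu* 'cheap', following the description of Figure 3.2 in the text; the actual figure has more dimensions. The function and variable names are hypothetical. A category (here, the set of words taking a given modification marker) satisfies the hypothesis if it forms a connected region of this graph.

```python
from collections import deque
from typing import Dict, Iterable, Set

# Assumed one-dimensional simplification of the map described for Figure 3.2.
CHAIN = ["hon", "heewa", "kenkoo", "kirei", "atataka", "tiisa", "yasu"]
EDGES: Dict[str, Set[str]] = {word: set() for word in CHAIN}
for left, right in zip(CHAIN, CHAIN[1:]):
    EDGES[left].add(right)
    EDGES[right].add(left)


def is_connected_region(category: Iterable[str], edges: Dict[str, Set[str]]) -> bool:
    """True if the nodes in `category` form a connected subgraph of `edges`."""
    nodes = set(category)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search restricted to nodes inside the category
        node = queue.popleft()
        for neighbour in edges[node]:
            if neighbour in nodes and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen == nodes


# Modification markers as described in the text.
NO = {"hon", "heewa", "kenkoo"}
NA = {"heewa", "kenkoo", "kirei", "atataka", "tiisa"}
I = {"atataka", "tiisa", "yasu"}
# Hypothetical counter-example: *i* attaches to 'book' but not to 'pretty'.
I_COUNTER = {"hon", "atataka", "tiisa", "yasu"}

for name, category in [("no", NO), ("na", NA), ("i", I), ("i (counter)", I_COUNTER)]:
    print(name, is_connected_region(category, EDGES))
```

The three attested categories each cover a contiguous stretch of the chain, whereas the hypothetical distribution of *i* skips the intermediate words and therefore fails the check.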

There are also conceptual spaces for information structure, and here I aim to describe semantic maps of information structure in Japanese. In terms of the theory of conceptual space, each feature that has been proposed to correlate with information structure (to be discussed in the next section) is considered to be a dimension in the conceptual space. Hence, the question I am pursuing here can be restated as follows: what dimensions is Japanese sensitive to, and how do Japanese linguistic forms (i.e., particles, word order, and intonation) map onto the semantic map of information structure in this language?


In the following section, I outline the definitions of topic and focus I adopt and the features correlating with topic and focus that are considered to be dimensions of conceptual space for information structure.

### **3.3 Topic, focus, and correlating features**

It has been pointed out that there is a correlation between topics and referents that are activated, definite, specific, animate, agents, and inferable, and between foci and referents that are inactivated, indefinite, non-specific, inanimate, patients, and non-inferable (Givón 1976; Keenan 1976; Comrie 1979; 1983). They form a prototype category; e.g., topics are typically (i.e., frequently) but not always definite or animate, and foci are typically but not always indefinite or inanimate. I propose that the feature *presupposed* is a necessary feature of topic, while the feature *asserted* is a necessary feature of focus. On the other hand, other features correlate with topic and focus respectively but are not necessarily topics or foci themselves. The features correlated with topic and focus are summarized in (2).


As will be shown in the following chapters, topic and focus are heterogeneous and have the complex features proposed in (2).

In this section, I will define each term in (2).

### **3.3.1 Topic**

A linguistic form is considered to represent a topic if it has the characteristics in (1) in §2.2.1, repeated here as (3).

(3) The topic is a discourse element that the speaker assumes or presupposes to be shared (known or taken for granted) and uncontroversial in a given sentence both by the speaker and the hearer.


Since the proposition that "the speaker assumes or presupposes to be shared both by the speaker and the hearer" is too long and complicated, this statement is sometimes shortened to "shared by the speaker and the hearer" to mean the same thing. Remember that the statement is always the speaker's assumption and hence avoids the paradox pointed out in Clark & Marshall (1981). The topic is by definition presupposed to be shared both by the speaker and the hearer. By "topic is shared", I mean that topics are either evoked, inferable, declining, or unused in terms of the given-new taxonomy (2) in §2.2.1. By "topic is presupposed", I mean that the speaker assumes that the hearer takes it for granted that the referent or the proposition being mentioned is known or accepted both by the speaker and the hearer. See also the discussion in §2.2.1.

The notion of *uncontroversial* is also important: topics cannot be questioned or argued against in a normal manner. For instance, English noun phrases preceded by *as for* or *regarding* cannot be questioned or argued against. Assuming that expressions like *regarding* and *as for* introduce topic expressions (Kuno 1972; 1976; Gundel 1974), this supports the idea that topics cannot be questioned or argued against. In (4), for example, *John* preceded by *as for* or *regarding* cannot be felicitously argued against as shown in (4-B2, B2′), whereas *a teacher*, which is considered to be focused, can be argued against as in (4-B2′′).

	- [{As for/Regarding} John], [he] [is a teacher].
	- B2: ??No, **Rob** is a teacher.
	- B2′: ??No, {as for/regarding} **Rob**, he is a teacher.
	- B2′′: No, John is **an engineer**.

In other words, topic expressions cannot be corrected by the next speaker in a normal manner. I call this type of test the *no*-test (see also the lie-test in Erteschik-Shir (2007: 39)).

Careful readers might think that it is perfectly natural to produce an utterance like (5), which is very similar to (4-B2), speculating that the *no*-test is a flawed test. The capital letters in (5) indicate that those words are stressed.

(5) B2: No, ROB is a teacher, not JOHN.

However, this does not mean that the test is flawed. Note that the participants in this conversation would not be satisfied with only (5); John's information needs to be provided. Therefore, a "complete" conversation would be something like (6).

	- B2: No, ROB is a teacher, not JOHN. (=(5))

A3: Then what is John?

B4: I guess he is an engineer.

This suggests that once B says *no*, s/he must provide an alternative to the focus (as long as s/he knows). I am inclined to label *ROB* in (6-B2) as focus and think that the existence of examples like (5-B) does not invalidate the *no*-test.

Further, it is unnatural to overtly receive topics as news, since overt acceptance means that they could be controversial. For instance, as shown in (7-B2), topics cannot be repeated as news by the next speaker, who has heard the utterance in (7-A1), whereas there is no problem in repeating the focus as news, as shown in (7-B2′).

(7) A1: [{As for/Regarding} John], [he] [is a teacher].
	- B2: ??Aha, **John**.
	- B2′: Aha, **a teacher**.

I call this test the *aha*-test. The *aha*-test is a natural consequence of the fact that the truth value of a sentence is assessed with respect to the topic (Strawson 1964).

Let us see specific examples of topics. For instance, as will be shown in Chapter 4, preposed zero-coded elements (elements without any overt particles) correspond to topics in Japanese, since the referent referred to by the preposed element is presupposed to be shared between the speaker and the hearer – as is *nezumi* 'mouse' in (8), where Ø indicates "a zero particle".

	- Y: **nezumi-Ø** mouse-Ø neko-ga cat-*ga* tukamae-ta-yo catch-past-fp 'The cat caught (the) mouse.'

The referent 'mouse' is interpreted as being shared between the speaker and the hearer; when this is not the case, as in (9), the utterance is infelicitous as shown by the contrast between (9-Y) and (9-Y′).


	- H: Anything fun today?
	- Y: ??**nezumi-Ø** mouse-Ø neko-ga cat-*ga* tukamae-ta-yo catch-past-fp Intended: 'The cat caught a mouse.' (=(8-Y))
	- Y′: neko-ga cat-*ga* **nezumi-Ø** mouse-Ø tukamae-ta-yo catch-past-fp 'The cat caught a mouse.'

When the mouse is not shared between the speaker (Y) and the hearer (H), the preposed *nezumi* 'mouse' is infelicitous as in (9-Y), while *nezumi* in the pre-predicate position is felicitous as in (9-Y′).

Some readers might think that preposed zero-coded elements do not necessarily correspond to topics. Instead, they might suspect that they correspond to foci, since *nezumi* 'mouse' in (8) is somehow "new" to the discourse, or, more precisely, it is not activated before the time of utterance (8-Y). However, as discussed below, foci are not subject to a constraint such that their referent must be assumed to be shared by the speaker and the hearer. Typically, foci are indefinite referents that are not shared, as specified in (2). Since preposed zero-coded elements in Japanese do not refer to indefinite referents, as shown in (9), I categorize them as topics.

### **3.3.2 Focus**

A linguistic form is considered to represent focus if it has the characteristics given in (16) in §2.3.1, repeated here as (10) for convenience.

(10) The focus is a discourse element that the speaker assumes to be news to the hearer and possibly controversial. S/he wants the hearer to learn the relation of the presupposition to the focus by his/her utterance. In other words, focus is an element that is asserted.

A focused discourse element is news in the sense that the hearer is assumed not to know the relationships between the element and the presupposition. For example, consider the following example:

	- A: hanako-ga Hanako-*ga* wat-ta-n-da-yo break-past-nmlz-cop-fp 'HANAKO broke (it).'
	- Presupposition: "x broke the window."
	- Assertion: "x = Hanako"

In (11-A), *hanako* is shared in the sense that her existence and identity are known by the speaker and the hearer. However, *hanako* is also news in relation to the presupposition "x broke the window" at the time of utterance (11-Q). The speaker of (11-A) lets the hearer learn the proposition that is assumed to be news: "x = Hanako." *Hanako* is the focus because this is the part where the assertion is different from the presupposition.

I also emphasize that the speaker thinks that the focus might be *controversial*. This implies that another participant of the conversation can potentially argue against the focus statement. Therefore, the focus can be felicitously negated by the next speaker, whereas the topic cannot. This is exemplified in (4), repeated here as (12).

(12) A: Do you remember the guys we met at last night's party? Their names are Karl and John, I guess. Karl is doing linguistics at the grad school of our university. I forgot what languages he speaks. [{As for/Regarding} John], [he] [is a teacher].
	- B: ??No, **Rob** is a teacher.
	- B′: ??No, {as for/regarding} **Rob**, he is a teacher.
	- B′′: No, John is **an engineer**.

As shown in (12), (part of) the focus *a teacher* can be negated felicitously, whereas the topic *John* cannot be negated felicitously. The concept of controversiality is more hearer-oriented and interactional than the previous notions of assertion, unpredictability, and unrecoverability. See also the discussion in §2.3.

### **3.3.3 Information structure in the sentence**

Here I discuss different types of information structure. Following Lambrecht (1994), I distinguish three types of information structure within a sentence: the **predicate-focus structure** (topic-comment structure), the **sentence-focus structure**, and the **argument-focus structure**.

In **the predicate-focus structure** or topic-comment structure, the predicate is the focus, as the name suggests. The predicate may include the complement of the predicate. This is exemplified in (13-A) for English, where the capital letters represent prominence.


(13) Predicate-focus structure


(14-A) is an example of predicate-focus structure in Japanese.

	- A: [**Hanako**-wa] Hanako-*wa* [syoosetu-o novel-*o* yon-deru] -yo read-prog-fp 'Hanako is reading a novel.'

In **the sentence-focus structure**, the whole sentence is focused. This is exemplified in (15-A) for English, where, again, the capital letters indicate stress.

(15) Sentence-focus structure


A Japanese example of a sentence-focus structure is shown in (16-A).

(16) Sentence-focus structure


In the sentence-focus structure, there is no explicit topic and all the arguments (e.g., *the children* and *school* in (16-A)) are part of the focus. However, if one assumes stage topics (Erteschik-Shir 1997; 2007), the distinction between the predicate-focus and the sentence-focus structures may not be clear. In (17-a), for example, *kyoo* 'today' might function as a topic in the sense that the truth value of the sentence is evaluated with respect to the specific time 'today' (although, in this study, I do not examine stage topics in detail).

	- b. [**Hanako**-wa] Hanako-*wa* [syoosetu-o novel-*o* yon-deru] -yo read-prog-fp 'Hanako is reading a novel.'


Note that, in terms of information structure, (17-a) is similar to (17-b), which has a predicate-focus structure. The predicate-focus and sentence-focus structures are similar in that the predicate is in the domain of focus. For this reason, I sometimes put the predicate-focus and sentence-focus structures into the same category and refer to them as **broad focus structures**.

In **the argument-focus structure**, elements other than the predicate are focused. This is exemplified in (18-A) for English and (19-A) for Japanese. This structure is sometimes referred to as the **narrow focus structure** as opposed to the broad focus structure because the domain of focus is limited to arguments or other elements except predicates.

	- Q: Who went to school?
	- A: [The CHILDREN] [went to school] . (Lambrecht 1994: p. 121)
	- Q: Who is reading a book?
	- A: [hanako-ga] Hanako-*ga* [syoosetu(-o) book(-*o*) yon-deru] -yo read-prog-fp 'Hanako is reading a book.'

I distinguish between two types of components constituting an information structure: the discourse element and the discourse referent, each of which is defined as in (20):

	- b. **(Discourse) referent**: An entity or proposition that a discourse element refers to (if a referent is a proposition, it is also called **proposition**).

### **3.3.4 Other features correlating with topic/focus**

This section discusses the definition of features which have been proposed to correlate with topic and focus. Although I do not necessarily annotate all the features in my corpus, I discuss all of them, since, in some place or other, they are relevant for my proposals.


### **3.3.4.1 Activation cost**

The activation cost of a referent is the assumed cost for the hearer to activate the referent in question. An active referent is a referent that the speaker assumes to be in the attention of the hearer (for which the activation cost is hence low), while an inactive referent is a referent that the speaker does not assume to be in the attention of the hearer (for which the activation cost is high) (see also Chafe 1994, inter alia).<sup>1</sup> Typically, referents are assumed to be brought to the hearer's attention by mentioning them or putting them in the hearer's area of visual perception.

A topic referent is often, but not always, activated in the hearer's mind. In (8), the referent 'mouse' is not necessarily considered to be active in H's mind. Although the mouse kept bothering Y and H when they were in their room, it is not appropriate for the speaker to assume that the mouse is in H's attention in the school, when the speaker happened to talk to H.

According to Dryer (1996), a focus is an element that is not activated. While this generalization well captures the view that the focus is the stressed linguistic element, I will not employ this definition: if *nezumi* 'mouse' in (8) is a focus, one has to come up with an explanation for why it is assumed to be shared between the speaker and the hearer, which is typically not the case with focus. According to my account, on the other hand, *nezumi* 'mouse' in (8) is a topic since its characteristics are in accordance with the topic correlation features in (2) and a special account for why *nezumi* 'mouse' is shared is not necessary. For a detailed discussion of the relationships between focus and stress, see Lambrecht (1994: Chapter 5).

A focus referent, on the other hand, is typically assumed not to be active in the hearer's mind. As Lambrecht (1994) has pointed out, the most frequent focus structure is the predicate-focus structure as in (21-A,B), where elements included in the predicate focus are typically not active in the hearer's mind.

	- A: [watasi-wa] 1.sg-*wa* [tomodati-to friend-with resutoran-de restaurant-loc supagetii spaghetti tabe-ta] -yo eat-past-fp 'I ate spaghetti with (a) friend in (a) restaurant.'
	- B: [boku-wa] 1.sg-*wa* [uti-de home-loc hon book yon-de-ta] -yo read-prog-past-fp 'I was reading (a) book at home.'

<sup>1</sup> I am using the term *attention* rather than *consciousness* because I believe the speaker's ability to evaluate the hearer's state of mind is eventually related to joint attention (Tomasello 1999).


In (21), it is reasonable to assume that Q did not have 'friend', 'restaurant', 'spaghetti', 'home', and 'book' in his/her attention at the time of utterance (21-Q).

There is another type of activation status: *semi-active*. I use the term *declining* specifically for the referent that has been active but starts to decline because other referents are also activated. Declining elements are in a semi-active state.

### **3.3.4.2 Definiteness**

A definite referent is a referent that is unique in the domain of discourse, while an indefinite referent is a referent that is not unique in the domain of discourse.

The claim that "topic is a discourse element that the speaker assumes or presupposes to be shared (known or taken for granted) and uncontroversial in a given sentence both by the speaker and the hearer" in (3) might lead to the interpretation that the topic is definite. As has been pointed out in the literature (Givón 1976; Keenan 1976; Comrie 1979; 1983), topics tend to be definite. However, definiteness is neither a necessary nor a sufficient feature of topics. Let us discuss the following:<sup>2</sup>

	- Y: **mangoo** mango konoaida the.other.day miyako-zima-de Miyako-island-loc tabe-ta-yo eat-past-fp '(I) ate (a) mango (we talked about) in Miyako island the other day.'

In (22) 'mango' is indefinite because the mango Y ate is not unique in the domain of discourse; H cannot uniquely identify which mango Y ate.<sup>3</sup> However, the element *mangoo* 'mango' is preposed because it has been discussed and hence is assumed to be shared between the speaker and the hearer. This makes it possible for *mangoo* to appear clause-initially as will be discussed in Chapter 5. I include this type of example in the category of unused, extending the term "unused" in Prince (1981).

<sup>2</sup> I am grateful to Yoshihiko Asao for pointing out this type of example.

<sup>3</sup> Yuji Togo and one of the reviewers (Morimoto) cast doubt on my claim that *mangoo* in (22) is indefinite; rather, they suggest that it could be generic. I am reluctant to accept this view because this *mangoo* seems to refer to a specific (non-generic) mango that Y ate, as indicated by the past tense of the predicate *tabe-ta* 'eat-past'.

However, some indefinite referents are more difficult to interpret as topics than others. For example, expressions such as *dareka* 'somebody' and *oozee-no hito* 'many people' are poor candidates for a topic as compared to other elements, judging from the fact that they cannot be followed by *wa*, but can be followed by *ga*, as shown in (23) (Kuno 1973b: p. 37 ff.). As will be shown in Chapter 4, *wa* marks the element whose referent is assumed to be active in the hearer's mind; it codes active topics. On the other hand, as will also be shown in Chapter 4, *ga* marks focus elements.

	- b. **oozee-no hito-{??wa/ga}** paatii-ni ki-masi-ta many-gen person-*wa/ga* party-to come-plt-past 'Speaking of many people, they came to the party.'

A focus referent, on the other hand, tends to be indefinite rather than definite (Givón 1976; Keenan 1976; Comrie 1979; 1983; Du Bois 1987). As has been mentioned above, the most frequent focus structure is the predicate-focus structure exemplified in (21) and it is reasonable to assume that Q in (21) cannot identify the referents included in the predicate focus such as 'friend', 'restaurant', 'spaghetti', and 'book'.

It is natural for topic referents to be realized frequently by definite noun phrases. The participants typically talk about the person or the thing whose identity is known to them. On other occasions, they talk about people or things in general terms. This option is an exceptional case known as generic reference, and it requires a special account. On the other hand, it is natural for focus referents to be frequently realized by indefinite noun phrases because, intuitively, an element that is not known by the hearer in relation to a presupposition is typically not shared between the speaker and the hearer.

### **3.3.4.3 Specificity**

A specific referent is fixed, i.e., the speaker has one particular referent in his/her mind; while a non-specific referent is not fixed, i.e., the speaker does not have one particular referent in mind (Karttunen 1969; Enç 1991; Abbott 1994). Turkish unambiguously codes specific and non-specific objects: if the NP is coded by the accusative case marker *-(y)i* (or *-(y)u*), it is interpreted as specific as in (24-a), while, if the NP is not overtly coded, it is interpreted as non-specific as in (24-b).

(24) a. Ali Ali bir one piyano-yu piano-acc kiralamak to.rent istiyor wants 'Ali wants to rent a certain piano.'

	- b. Ali Ali bir one piyano piano kiralamak to.rent istiyor wants 'Ali wants to rent a (non-specific) piano.' (Enç 1991: p. 4-5)

Specific referents like 'piano' in (24-a) are fixed in the sense that the speaker wants to rent a particular piano in his/her mind. Non-specific referents like 'piano' in (24-b) are not fixed in the sense that the speaker does not care which piano s/he could rent; any piano works in (24-b).

Topics are frequently but not always specific. Consider example (25), which is slightly modified from (22).

	- Y: **mangoo** mango raisyuu next.week miyako-zima-de Miyako-island-loc taberu-yo eat-fp '(I will) eat (a) mango (we talked about) in Miyako island next week.'

In this case, *mangoo* is non-specific because speaker Y does not know which mango he will eat. However, it is also the topic, for the same reason discussed in association with (22).

There is a concept that is related to but distinct from non-specificity: genericity. Generic referents do not represent an individual entity; rather, they represent a concept or a category. On the other hand, non-specific referents still represent an individual entity. According to Kuno (1972), generic referents are always available to be topics. In (26), the element *kuzira* corresponds to a generic referent as the topic.

(26) **kuzira**-wa honyuudoobutu-desu
     whale-*wa* mammal-cop.plt
     'A whale is a mammal.' (Kuno 1972: p. 270)

When participants talk about generic referents, the referent that is presupposed to be shared is the concept itself. Therefore, generic referents are always shared (unless the hearer has never heard the expression in question). As will be shown in Chapter 4, however, *wa* codes the element whose referent is assumed to be an active or semi-active inferable in the hearer's mind, and not all generic elements can be coded by *wa*.

Foci, on the other hand, can either be specific or non-specific, but they tend to be non-specific. In (27-A), the speaker may or may not have a particular book in his/her mind.


(27) Q: What are you going to do tomorrow?
     A: [I]'m going to [read **a book**] tomorrow.

In the example above, the specificity of the book in question is not important. Instead, the whole event of reading a book is more relevant to the question.

### **3.3.4.4 Animacy**

An animate referent is a living entity such as a human being, a cat, or a dog, while an inanimate referent is a non-living entity, such as a computer, a book, or love. Snakes, bugs, plants, and flowers are somewhere in between.

Topics tend to be animate, while foci tend to be inanimate (Givón 1976; Keenan 1976; Comrie 1979; 1983; Du Bois 1987). Although this study does not discuss animacy in detail, the notion is relevant to some aspects of the distinction between zero vs. overt particles, as briefly mentioned in Chapter 4.

### **3.3.4.5 Agentivity**

I employ the prototypes of agent and patient discussed in Dowty (1991: inter alia). An agent is a referent that typically has volition, has sentience, causes an event or change of state in another participant, or moves. On the other hand, a patient is a referent that typically undergoes a change of state, corresponds to an incremental theme, is causally affected by another participant, or is stationary relative to the movement of another participant.

Agentivity or subjecthood is often discussed in association with topics (Li 1976: inter alia). However, it is inaccurate to assume that a topic is limited to an agent or that an agent is always the topic. It is important to keep in mind that topics correlate with agents and subjects, but being an agent or subject itself is neither a necessary nor a sufficient condition to be a topic. Focus, on the other hand, correlates with patients. In the same way as with topics, however, it is inaccurate to assume that all foci are patients. The relationships between topic/focus and agentivity are discussed in Chapter 4, in association with the distinction between zero vs. overt particles.

### **3.3.4.6 Inferability**

The term *inferable* is borrowed from Prince (1981), though many other scholars have discussed similar concepts (e.g., Haviland & Clark 1974; Chafe 1994). A discourse referent is inferable "if the speaker assumes the hearer can infer it, via logical – or, more commonly, plausible – reasoning, from [discourse referents] already [active] or from other inferables" (Prince 1981: p. 236).<sup>4</sup> A referent is typically inferable through a part-whole or metonymic relationship between the referent itself and another referent that is already active. Inferable referents can be topics, when they are assumed to be shared between the speaker and the hearer, or they can be foci.

### **3.4 Methodology**

In this section, I will discuss the methods used in this study, based on the definitions and assumptions regarding topic and focus specified in the last section. This study employs acceptability judgements, production experiments, and corpus annotation, to be discussed in the following sections.

### **3.4.1 Topic and focus in acceptability judgements**

In acceptability judgements, I sometimes employ the *hee* test, where the element in question is focused if it can be repeated after the expression *hee* 'really', while it is not focused if it cannot. See also the discussions in §2.2.1, §2.3.1, §3.3.1, and §3.3.2. The *hee* test is exemplified in (28).<sup>5</sup>


Let us assume that in (28–Taro) it is presupposed that something happened to Taro yesterday. Since there is always something happening to Taro, this presupposition is appropriate even in an out-of-the-blue context. Therefore, *ore* '1sg' is interpreted as topic, while *hebi mi-ta-n-da* 'snake see-past-nmlz-cop' is interpreted as focus in this particular context. Given this situation, the hearer of (28–Taro) can respond to this utterance as in (28–Jiro): while the focus part *hebi mi-ta-n-da* 'snake see-past-nmlz-cop' can be felicitously repeated followed by *hee* 'really', the topic part *ore* '1sg', which corresponds to *taroo* in (28–Jiro), cannot. Topics are identified negatively in this test. The assumption of the *hee* test is that topics can never be taken as "news" or "a surprise" since they are assumed to be shared between the speaker and the hearer, while foci are expected to be "news" or "a surprise" to the hearer.

<sup>4</sup>The terms are replaced according to this study's terminology.

<sup>5</sup> Read Jiro's utterance in (28) with exclamative intonation. Question intonation always works regardless of whether the element in question is a topic or a focus.

The expression *kinoo* 'yesterday' cannot be repeated either. I assume that this is because *kinoo* 'yesterday' is also a part of the presupposition. However, I am neutral as to whether or not *kinoo* 'yesterday' is a topic in the same sense as *ore* '1sg'. It belongs to the category of stage topic discussed in §3.3.3. In this study, I restrict myself to investigating elements which constitute arguments of sentences and do not discuss stage topics in detail.

In grammaticality judgements, contexts will be provided in order for topics to be typical topics (presupposed, definite, etc.) and for foci to be typical foci (asserted, indefinite, etc.). Examples of contexts which prompt different focus structures are provided in (29) to (31), where the target expression is *koinu(-o) yuzut-ta* 'gave a/the puppy'.

(29) A: sooieba [**koinu**] [**yuzut-ta**]-yo
        by.the.way puppy give-past-fp
        'By the way, (I) gave the puppy (to somebody).'

(30) A: kinoo-wa [**koinu** **yuzut-ta**]-yo
        yesterday-*wa* puppy give-past-fp
        'Yesterday (we) gave a puppy.'

(31) Q: What did you give to him?
     A: [**koinu-o**] [**yuzut-ta**]-yo
        puppy-*o* give-past-fp
        '(I) gave the/a puppy.'

In predicate-focus contexts like (29), the referent of the discourse element in question has typically already appeared in the context preceding the target expression; in this example, *koinu* 'puppy' has appeared in the context and the speaker and the hearer share the identity of the puppy. Therefore, *koinu* 'puppy' is easily presupposed and is interpreted as topic. The speaker intends to tell the hearer what happened to the puppy because this piece of news is not shared with her. The readers may wonder why I do not simply use a question like 'what happened to the puppy?', which typically prompts a predicate-focus structure. This question, however, strongly favours omitting the element *koinu* 'puppy', since it appears in the immediate context. This is the reason why the context which prompts a predicate-focus structure like (29) appears to be complicated.

In sentence-focus contexts like (30), on the other hand, the referent is typically not shared; in (30-A), *koinu* 'puppy' appears out of the blue. The whole utterance is interpreted as news or focus. In this case, A of (30) can be easily preceded by questions like 'what happened yesterday?'.

Argument-focus contexts like (31) are typically *what*- or *who*-questions that prompt a single argument as an answer. In (31), the answer is *koinu* 'puppy', while 'A gave (something)' is presupposed.

### **3.4.2 Assumptions in experiments**

In the production experiments, I asked Japanese native speakers to read aloud sentences preceded by different contexts, which prompt different types of focus structures in the sentences. These contexts are designed in the same way as discussed in the last section.

### **3.4.3 Corpus annotation and analysis**

In analyzing spontaneous speech, it is relatively difficult to apply the definitions of topic and focus discussed above, since clean contexts are not available, in contrast to constructed examples. For this reason, I will provide definitions of topic and focus for the corpus investigation based on the assumptions concerning these notions discussed in §3.3. The basic idea is that, since it is difficult to determine whether a discourse referent is presupposed or not, information status is used to approximate the referent's status in the given-new taxonomy (§3.4.3.3), instead of the *presupposed* vs. *asserted* distinction. The activation status of the referent in question is approximated by whether the referent has an antecedent or not.

Firstly, I will discuss the characteristics of the corpus (§3.4.3.1) and the procedure used in the annotation of anaphoric relations (§3.4.3.2). Then the annotation of relevant features will be discussed (§3.4.3.3).


### **3.4.3.1 Corpus**

This study investigates 12 core-data talks of simulated public speaking from *the Corpus of Spontaneous Japanese* (CSJ: Maekawa 2003; Maekawa et al. 2004). The data list and basic information are summarized in Table 3.1. The data to be investigated were randomly chosen out of the 107 core data of simulated public speaking. Simulated public speaking is a type of speech where the speakers talk about everyday topics such as 'my most delightful memory' or 'if I lived on a deserted island'. I use the RDB version of CSJ (Koiso et al. 2012) to search the corpus.


Table 3.1: Corpus used in this study

The core data of CSJ has rich information of various kinds. I used the information in (32) to generate the information relevant for this study.

(32) a. Utterance time


Relevant variables will be explained in each section.

### **3.4.3.2 Annotation of anaphoric relations**

The information on anaphoric relations is used to identify topics and foci. Anaphoric relations are identified as described below, following the basic procedures proposed in Iida et al. (2007) and Nakagawa & Den (2012).


	- b. **Classification of discourse elements**: Discourse elements are classified into categories based on what they refer to.
	- c. **Identification of anaphoric relations**: The link between the anaphor and the antecedent is annotated.

First, I identified the grammatical function of clauses (a in (33)), namely A, S, vs. P. This is necessary in order to determine the discourse elements and zero pronouns to be investigated. In Japanese, pronouns such as *watasi* '1sg', *anata* '2sg', and *kare* '3sg' are rare; the most frequent pronoun is the zero pronoun. In (34), for example, the speaker indicated by Ø and 'the dog' indicated by Ø are zero pronouns, and are assumed to appear immediately before the predicates. As shown in (34-d), two zero pronouns Ø and Ø can appear in the same clause; still, native speakers have no trouble in understanding the utterance.

b. aa zutto kono inu-to issyoni eii Ø sun-de
   fl all.the.time this dog-with together fl Ø live-and
   '(I) lived with this dog all the time.'
c. sikamo oo tabi-o Ø suru toki-mo
   moreover fl travel-*o* Ø do time-also
   'Moreover, also when (I) travel,'
d. kuruma-ni Ø Ø nose-te
   car-loc Ø Ø put-and
   '(I) put (the dog) in my car.'
e. ee amerika-o tabi Ø si-ta-to
   fl America-acc travel Ø do-past-q
   '(I) traveled America.' (S02M1698: 182.88-195.87)

I identified 7697 discourse elements (5234 NPs, 655 overt pronouns, and 1808 zero pronouns) from the corpus.

Second, I classified discourse elements into 13 categories depending on what they refer to (b in (33)): common referent, connective, speaker, hearer, time, filler, exophora, question, quantifier, degree word, proposition, and others. Although there are many categories, only common referents are relevant for the purpose of this study. The other categories were annotated for future studies. Also, I limit my analyses to A, S, P, and Ex (to be discussed below). Datives are also added for comparison. This process leaves us with 2301 elements (1662 NPs, 80 overt pronouns, and 559 zero pronouns). However, I occasionally use data which include other kinds of elements for detailed analysis.

Third, I identified the anaphoric relation for each discourse element (c in (33)). A unique ID number is given to the set of discourse elements which refer to the same entity. In (35), for example, *syoo-doobutu* 'a small animal' in line a, and *Ø* in lines c, e, and f, all refer to the small animal introduced in line a. All of them are given ID number 1 because they refer to the same entity. The element *syoo-doobutu* 'a small animal' is called the **antecedent** of the **anaphor** *Ø* in line c. In the same way, the element *Ø* in line c is the antecedent of the **anaphor** *Ø* in line e. The element *watasi* refers to another entity, the speaker, and is given another ID number, namely 2.


(S00F0014: 619.51-631.71)
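The ID-based bookkeeping described above can be sketched in a few lines of Python. This is only an illustration, not the annotation tool used in this study: the list `mentions`, the function `link_antecedents`, and the line in which *watasi* occurs are assumptions made for the example; only the forms and ID numbers follow the description of (35).

```python
# Hypothetical mentions in discourse order: (line label, surface form, entity ID).
# Forms and IDs follow the description of (35); the position of "watasi" is assumed.
mentions = [
    ("a", "syoo-doobutu", 1),
    ("b", "watasi", 2),
    ("c", "Ø", 1),
    ("e", "Ø", 1),
    ("f", "Ø", 1),
]


def link_antecedents(mentions):
    """For each mention, return the index of the closest preceding mention
    with the same entity ID, or None if there is no such mention."""
    last_seen = {}
    links = []
    for index, (_line, _form, entity_id) in enumerate(mentions):
        links.append(last_seen.get(entity_id))
        last_seen[entity_id] = index
    return links


links = link_antecedents(mentions)
for (line, form, entity_id), antecedent in zip(mentions, links):
    if antecedent is None:
        status = "no antecedent"
    else:
        status = f"antecedent in line {mentions[antecedent][0]}"
    print(line, form, entity_id, status)
```

Under these assumptions, the zero pronoun in line c is linked to *syoo-doobutu* in line a, and the zero pronoun in line e is linked to the one in line c, mirroring the antecedent-anaphor chain described in the text.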

Using anaphoric relations and other information from the corpus, I generated other relevant features to be discussed in the next section.

### **3.4.3.3 Annotation of topichood and focushood**

**3.4.3.3.1 Approximation to the given-new taxonomy** The status of a referent in the given-new taxonomy is approximated by whether the expression referring to the referent has an antecedent or not. An expression that has an antecedent is called an **anaphoric** element, while an expression that does not have an antecedent is called a **non-anaphoric** element. I use the term information status to refer to the status of a referent that is anaphoric or non-anaphoric. Note that the terms anaphoric vs. non-anaphoric are used in Chapters 4, 5, and 6 only to refer to corpus counts. The referent of an anaphoric element is assumed to be either evoked or declining in terms of the given-new taxonomy, and active or semi-active in terms of activation status. On the other hand, the referent of a non-anaphoric element is inferable, unused, or brand-new in terms of the given-new taxonomy, and semi-active or inactive in terms of activation status. I prefer to use the terms of the given-new taxonomy over those related to activation status, since they are more fine-grained. The correspondences among activation statuses, the given-new taxonomy, and corpus annotations are shown in Table 3.2. The distinction between inferable, declining, unused, and brand-new is judged manually when necessary. By "shared", I mean the referent is evoked, declining, inferable, or unused in terms of the given-new taxonomy.

Table 3.2: Activation status, the given-new taxonomy, and corpus annotation
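The approximation stated in the preceding paragraph can be written down as a small lookup, given here as a minimal sketch. The names `GIVEN_NEW_BY_ANNOTATION`, `ACTIVATION_BY_ANNOTATION`, `candidate_statuses`, and `is_shared` are hypothetical; the status labels and the definition of "shared" are taken from the text.

```python
# Keys: whether the expression has an antecedent (True = anaphoric element).
GIVEN_NEW_BY_ANNOTATION = {
    True: ("evoked", "declining"),
    False: ("inferable", "unused", "brand-new"),
}

ACTIVATION_BY_ANNOTATION = {
    True: ("active", "semi-active"),
    False: ("semi-active", "inactive"),
}

# "Shared" as defined in the text: everything except brand-new.
SHARED_STATUSES = {"evoked", "declining", "inferable", "unused"}


def candidate_statuses(has_antecedent):
    """Return the given-new statuses compatible with the corpus annotation."""
    return GIVEN_NEW_BY_ANNOTATION[has_antecedent]


def is_shared(status):
    """True if a given-new status counts as shared between speaker and hearer."""
    return status in SHARED_STATUSES


for has_antecedent in (True, False):
    label = "anaphoric" if has_antecedent else "non-anaphoric"
    print(label, candidate_statuses(has_antecedent), ACTIVATION_BY_ANNOTATION[has_antecedent])
```

The lookup makes the limits of the approximation visible: a non-anaphoric element may still be shared (inferable or unused), which is why the finer distinctions have to be judged manually.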


**3.4.3.3.2 Grammatical function** Following Comrie (1978) and Dixon (1979), I distinguish S, A, and P as grammatical functions. S is the only argument of an intransitive clause, A is the agent-like argument of a transitive clause, and P is the patient-like argument of a transitive clause. For now, I simply distinguish A and P based on whether the argument in question is or can be coded by *ga* or *o*. When it can be coded by *ga*, it is A; when it can be coded by *o*, it is P. Furthermore, I sometimes distinguish agent S and patient S if needed.

In addition to S, A, and P, I identify non-argument elements (Ex). Non-argument elements are those which appear to be part of the clause but do not have direct relationships with the predicate. A typical example is shown in (36).

(36) **zoo-wa** hana-ga nagai
     elephant-*wa* nose-*ga* long
     'The elephant, the nose is long (The elephant has a long nose).' (Mikami 1960)

As exemplified in (36), the element *zoo* 'elephant' is considered to be Ex. *Hana* 'nose' is the only argument of the predicate (S), and *zoo* 'elephant' does not have direct relationships with the predicate *nagai* 'long'; still, *zoo* 'elephant' looks like part of the clause and needs a label, which happens to be "Ex".

Although Ex is frequently coded by so-called topic markers such as *wa* and *toiuno-wa*, *wa*- and *toiuno-wa*-coded elements are not always labelled as Ex. If they are considered to be S, A, or P, they are labelled as such. For example, in the case where *hana* 'nose' is coded by *wa* as in (37), *hana* 'nose' is labelled as S, instead of Ex.

(37) zoo-no hana-wa nagai
     elephant-gen nose-*wa* long
     'The elephant's nose is long.'

**3.4.3.3.3 Other features** Ideally, one should annotate all the variables proposed in (2), but this has been impossible, partly due to time and labor limitations, and partly due to the lack of clear criteria to annotate them consistently. For example, definiteness and specificity are difficult to annotate consistently. Multiple annotators are needed for reliable and objective analyses. Animacy could be simpler, but I have not annotated this feature throughout the corpus due to the time and labor limitations. The previous literature indicates that these features play only a small role in Japanese grammar. These features will be discussed when necessary.

### **3.5 Summary**

In this chapter, I discussed the framework employed in this study and the method of corpus annotation and analysis. In the next three chapters, different aspects of spoken Japanese grammar (i.e., particles, word order, and intonation) will be analyzed based on the framework and methodology discussed in this chapter.

# **4 Particles**

### **4.1 Introduction**

In this chapter, I will describe the so-called topic particles coding different kinds of topics (§4.2), and the so-called case particles coding different kinds of foci and grammatical functions (§4.3). Table 4.1 summarizes these particles according to whether they code topic or focus in different statuses of the given-new taxonomy. As clarified earlier, I mainly use the terms of the given-new taxonomy, but the activation status is also specified in the table to show the correspondences between the two classifications. The shaded cells indicate that they are indistinguishable from each other in the annotation proposed in §3.4. Different topic particles attach to elements in different statuses of the given-new taxonomy, while case particles are not sensitive to the given-new taxonomy. Instead, case particles are sensitive to the grammatical functions and the broad vs. narrow focus distinction, as summarized in Table 4.2. The morpheme cop indicates the copula.

Table 4.1: Topic particle vs. activation status and the given-new taxonomy


I argue that these tables constitute a semantic map (Croft 2001; Haspelmath 2003). By arguing this, I postulate that the scales of the given-new taxonomy (represented by the columns) and the topic vs. focus distinction (represented by the rows) in Table 4.1 are cognitively real and continuous in the way they are ordered in the tables. The same applies to the contrast vs. non-contrast distinction (rows) and the grammatical function (columns) in Table 4.2. This argument and the Semantic Map Connectivity Hypothesis (1) in §3.2 lead us to our hypothesis in (1).


Table 4.2: Case particle vs. grammatical function


(1) **Semantic Map Connectivity Hypothesis of Information Structure**: Since the scales of the given-new taxonomy and the topic vs. focus distinction in Table 4.1 and the contrast vs. non-contrast distinction and the grammatical function in Table 4.2 are cognitively continuous, the particles map onto a connected region in the conceptual space.

The semantic maps in Tables 4.1 and 4.2 support the hypothesis in (1), because all of the particles are in connected regions. In the following sections, I will show the details of the distribution of these particles with specific examples.
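What "connected region" means for a table-shaped conceptual space can be illustrated with a short connectivity check. This is a sketch under stated assumptions: the column order of the given-new statuses, the (row, column) encoding, and the cells assigned to each particle are simplifications based on the prose description of Table 4.1, and the function `is_connected` is a name introduced here for illustration.

```python
from collections import deque

# Assumed layout: rows are topic (0) and focus (1); columns follow an assumed
# ordering of the given-new statuses.
COLUMNS = ["evoked", "inferable", "declining", "unused", "brand-new"]


def is_connected(cells):
    """Return True if the (row, column) cells form one region in which every
    cell can be reached from every other through side-adjacent cells."""
    cells = set(cells)
    if not cells:
        return True
    start = next(iter(cells))
    seen = {start}
    queue = deque([start])
    while queue:
        row, col = queue.popleft()
        for neighbour in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            if neighbour in cells and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen == cells


# Simplified topic-row regions, following the prose summary of Table 4.1.
regions = {
    "toiuno-wa": {(0, 0)},
    "wa": {(0, 0), (0, 1)},
    "cop-kedo/ga": {(0, 2), (0, 3)},
}

for particle, cells in regions.items():
    print(particle, is_connected(cells))

# A hypothetical particle coding only evoked and unused topics would violate (1).
print("hypothetical", is_connected({(0, 0), (0, 3)}))
```

A particle whose cells fail this check would count as a counterexample to the hypothesis in (1), provided the ordering of the rows and columns is accepted.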

### **4.2 So-called topic particles**

As shown in Table 4.1, evoked elements are coded by *toiuno-wa* or *wa*, while inferable elements are coded by *wa*. Declining and unused elements are coded by a copula followed by *kedo* 'though' or *ga* 'though'. The zero particle (indicated by Ø) can code elements in the given-new taxonomy. The statuses in the given-new taxonomy have corresponding activation statuses in the hearer's mind as assumed by the speaker. I propose that inferable and declining elements, as well as unused and brand-new elements, are in different activation statuses in the assumed hearer's mind.

Table 4.3 and Figure 4.1 show the distributions of elements in different information statuses coded by different particles in our corpus. Overall, the topic particles *toiuno-wa* and *wa* code a higher ratio of anaphoric elements than the case particles *ga* and *o*. The particles *mo* and *ni* are included here for comparison. In the corpus, the markers *wa*, *toiuno-wa*, and *mo* are the most frequent topic markers and *ga*, *o*, and *ni* are the most frequent case markers (excluding *no* 'gen', not included here). Note that "anaphoric" in the present work just means that the element in question has a co-referential antecedent and "non-anaphoric" means that it does not. Elements with bridging antecedents are categorized as "non-anaphoric." See §3.4.3.2 for details on the annotation procedure.

A linear mixed effects model was employed to predict information status.<sup>1</sup> I included particles (*toiuno-wa, wa, mo, ga, o, ni*), word order (nth in CSJ, see §5.1 for the definition of this annotation), and intonation (phrasal vs. clausal IU, see §6.1 for the definitions) as fixed effects, and speakers (TalkID) as a random effect. The model with the effects of particles, word order, and intonation is significantly different from the models without each of those effects (likelihood ratio test, *p* < 0.001 for the model without particles, *p* < 0.01 for the one without word order, and *p* < 0.05 for the one without intonation).<sup>2</sup> The least-squares mean for each level of the particles was calculated, and pairwise comparisons among particles were conducted. The results of these pairwise comparisons are shown in Table 4.4, which only includes the pairs of interest and those with p-values of less than 0.5.<sup>3</sup> The contrast of *ga* − *o*, whose estimate is −0.465, indicates that the least-squares mean of the odds ratio of anaphoric elements coded by *ga* is significantly lower than the least-squares mean of the odds ratio of those coded by *o*; in other words, anaphoric elements are more likely to be coded by *o* than by *ga*. Similarly, anaphoric elements are more likely to be coded by *wa* than by *ga*, *ni*, or *mo*. The difference between the particles *o* and *wa*/*toiuno-wa* is not statistically significant. As will be discussed in §4.4.2, this is because *wa* (and presumably *toiuno-wa*) prefers to code anaphoric As over anaphoric Ps. Further, the difference between *toiuno-wa* and *ga* is not statistically significant because of the effect of intonation; most of the *toiuno-wa*-coded elements are in phrasal IUs (see Chapter 6).
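The logic of the likelihood ratio tests reported above can be made explicit with a short sketch. It is not the R code used in this study: the functions `chi2_sf` and `likelihood_ratio_test` are written here for illustration, the log-likelihoods are placeholders rather than values from the corpus, and the five degrees of freedom assume that dropping a six-level particle factor removes five parameters.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for df = 1 and df = 2, then the recurrence
    # Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    if df % 2 == 0:
        k, q = 2, math.exp(-half)
    else:
        k, q = 1, math.erfc(math.sqrt(half))
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


def likelihood_ratio_test(loglik_full, loglik_reduced, df_diff):
    """Compare two nested models: return the test statistic and its p-value."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf(statistic, df_diff)


# Placeholder log-likelihoods (NOT values from this study).
statistic, p_value = likelihood_ratio_test(-1200.0, -1215.0, df_diff=5)
print(f"chi-square = {statistic:.1f}, p = {p_value:.2g}")
```

A small p-value means that the reduced model fits significantly worse, i.e., that the dropped factor contributes to predicting the outcome.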

The statistical analysis shows that *toiuno-wa* codes as high a ratio of anaphoric elements as *wa*. However, the detailed qualitative analysis in §4.2.1 reveals that the referents of *toiuno-wa*-coded elements are in fact evoked: the referents of non-anaphoric elements coded by this particle have been introduced implicitly in the previous context. On the other hand, the referents of *wa*-coded elements have not necessarily been introduced in the previous context; they can be inferable. The zero marker Ø does not appear frequently enough in the corpus because CSJ consists of formal speech. As has already been pointed out in Tsutsui (1984) and discussed in §2.4.2.7, zero markers tend not to appear in formal speech. There are not enough examples of the copula followed by *ga* or *kedo* (7 examples), and therefore I refrain from generalizing based on this small amount of data. Instead, I will employ grammatical judgements and analyze these examples qualitatively, a procedure which is also supported by observations in the previous literature.

<sup>1</sup> I used R for the statistical analysis of the study. https://www.r-project.org The packages lme4 and lsmeans were employed.

<sup>2</sup>The effects of word order and intonation will be discussed in Chapters 5 and 6, respectively.

<sup>3</sup>The p-values are adjusted using the Tukey method for comparing a family of multiple estimates.

I also calculated the persistence of each element. Persistence, which is proposed in Givón (1983) to measure topichood, is the number of times the referent is mentioned after it is mentioned by the expression in question. The persistence of elements followed by different particles is shown in Table 4.5. The table shows the counts of persistent and non-persistent elements; persistent elements are mentioned at least once in the discourse following their mention, while non-persistent elements are not mentioned in the following discourse. See §3.4.3.2 for the annotation procedure. A linear mixed effects model was applied to predict persistence (persistent vs. non-persistent). I used particles (*toiuno-wa, wa, mo, ga, o, ni*), word order (nth in CSJ), and intonation (phrasal vs. clausal IU) as fixed effects and speakers (TalkID) as a random effect. The model with the effects of particles, word order, and intonation is significantly different from the model without either the effect of particles or that of word order (likelihood ratio test, *p* < 0.001 for the model without particles, *p* < 0.01 for the model without word order). However, the model with the effects of particles, word order, and intonation is not significantly different from the model without the effect of intonation (*p* = 0.423). The least-squares means were calculated, and pairwise comparisons among particles were conducted. The results of these pairwise comparisons are shown in Table 4.6, which only includes the pairs of interest and those whose p-values are less than 0.5. Although the effect of particles is significant, this effect appears to come mainly from the contribution of *ni* in contrast with *toiuno-wa*, *wa*, and *o*, which is not of interest in the present work. One notable contrast is the effect of *toiuno-wa* in contrast to *ga*. The result suggests that *toiuno-wa* is more likely to code persistent elements than *ga*.

Figure 4.2 shows how many times the referent in question is mentioned after the NPs or pronouns coded by each particle were mentioned. Numbers more than or equal to 5 are compressed as "5+".
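The persistence measure and the "5+" binning can be sketched as follows, assuming that each mention has already been assigned an entity ID as described in §3.4.3.2. The function names and the ID sequence are hypothetical and serve only to illustrate the counting.

```python
from collections import Counter


def persistence(entity_ids):
    """For each mention (given as an entity ID in discourse order), count how
    many times the same entity is mentioned later in the discourse."""
    remaining = Counter(entity_ids)
    counts = []
    for entity_id in entity_ids:
        remaining[entity_id] -= 1  # exclude the current mention itself
        counts.append(remaining[entity_id])
    return counts


def is_persistent(count):
    """Persistent elements are mentioned at least once afterwards."""
    return count >= 1


def mention_bin(count):
    """Compress counts of five or more into a single '5+' bin."""
    return "5+" if count >= 5 else str(count)


# Hypothetical ID sequence: entity 1 is mentioned four times, entity 2 once.
ids = [1, 2, 1, 1, 1]
counts = persistence(ids)
print(counts)
print([is_persistent(c) for c in counts])
print([mention_bin(c) for c in counts])
```

In this toy sequence, the first mention of entity 1 has a persistence of 3, while the single mention of entity 2 is non-persistent.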

Elements coded by so-called topic markers cannot be repeated as news, as shown in the hypothetical conversation between A and B in the following examples. As in (2) and (3), the *toiuno-wa*-coded elements *mooningu thii* 'morning tea'<sup>4</sup> and *eberesuto-kaidoo* 'the Everest Trail' cannot be repeated as news, while the case-marker-coded elements *kootya-ka koohii-ka* 'tea or coffee', *tibetto* 'Tibet', *nepparu* 'Nepal', and *kooeki-ro* 'trading road' can be repeated as news.

<sup>4</sup>As discussed in §4.2.1, there are some formal variations of *toiuno-wa*; *tteno-wa* is one of these variations.

Figure 4.1: Particle vs. information status (ratio)

Figure 4.2: Particle vs. # of mention (ratio)



Table 4.3: Particle vs. information status

Table 4.4: Results of pairwise comparison among the least-squares means (information status)


(0 ≤ '\*\*\*' ≤ 0.001 ≤ '\*\*' ≤ 0.01 ≤ '\*' ≤ 0.05 '.' ≤ 0.1 ≤ ' ' 1)

Table 4.5: Particle vs. persistence



Table 4.6: Results of pairwise comparison among the least-squares means (persistence)

(0 ≤ '\*\*\*' ≤ 0.001 ≤ '\*\*' ≤ 0.01 ≤ '\*' ≤ 0.05 '.' ≤ 0.1 ≤ ' ' 1)



As shown in (4), the element *thii-taimu* 'tea time', coded by the copula + *kedo*,<sup>5</sup> and the *wa*-coded element *takai tokoro* 'places of high elevation' cannot be repeated as news, while the *ga*-coded elements can.

	- 'water is very important.' (S01F0151: 339.78-349.56)
	- B: hee, {**??thii-taimu**/**??takai tokoro-de**/**kikennna kanoosee-ga**/**mizu-ga**} Oh, {tea time/on places of high elevation/the possibility of danger/water}

As indicated in Table 4.1, and as will be discussed below, brand-new elements can never be coded by topic markers; they can never be assumed to be shared between the speaker and the hearer. Non-anaphoric elements coded by topic markers are inferable, declining, or unused, as will be discussed in the following sections. For example, it is unacceptable for topic markers to code the brand-new element *oozei-no hito* 'many people' out of the blue, as shown in (5).

(5) \***oozei-no** **hito-wa** paathii-ni ki-masi-ta
    many-gen person-*wa* party-dat come-plt-past
    'Speaking of many people, they came to the party.' (Kuno 1973b: 45)

Similarly, it is unacceptable for other topic markers to code these elements, whereas *ga* can code them.

(6) **oozei-no** **hito-{??toiuno-wa/??da-kedo/??Ø/ga}** paathii-ni ki-masi-ta
    many-gen person-{*toiuno-wa*/cop-*though*/Ø/*ga*} party-dat come-plt-past
    'Many people came to the party.'

<sup>5</sup>Again there are some variations of this marker and I will discuss this in §4.2.3.


While *oozei-no hito* 'many people' in (6) was unanchored in terms of Prince (1981), *taroo-no otoosan* 'Taro's father' in (7) is anchored. The element coded by a topic marker is still not acceptable in an out-of-the-blue context.

(7) a! **taroo-no** **otoosan-{??toiuno-wa/??wa/??da-kedo/Ø}** asoko-de tabako sut-teru-yo
    oh! Taro-gen father-{*toiuno-wa*/*wa*/cop-*though*/Ø} there-loc cigarette smoke-prog.plt-fp
    'Taro's father is smoking over there.'

Therefore, topic markers in Japanese are sensitive to the given-new taxonomy rather than to definiteness and identifiability.<sup>6</sup>

Finally, as will be discussed in detail in §4.2.4, an element coded by a zero particle (Ø) that precedes other arguments and is uttered with a coherent intonation contour cannot be repeated as news, and is hence presupposed to be shared knowledge.

(8) Y: **nezumi-Ø** neko-ga tukamae-ta-yo
       mouse-Ø cat-*ga* catch-past-fp
       'The cat caught (the) mouse.'
    H: hee, {??nezumi, neko(-ga)}
       Oh, {mouse, cat(-*ga*)} (=(8) in §3.3.1)

In the following sections, I analyze each topic marker in detail.

### **4.2.1** *Toiuno-wa*

In this section, I will show that *toiuno-wa* codes elements whose referents are evoked through the explicit or implicit introduction of the elements or through their availability in the universe of discourse.

There are several phonetic variations of *toiuno-wa*: *(t)teno-wa*, *t(y)uuno-wa*, *teiuno-wa*, etc. I put them into the same category as *toiuno-wa* and assume that they are stylistic variants of the same particle.

<sup>6</sup> I suppose that the zero particle is acceptable because the zero particle in this case is ambiguous between topic and focus coding.


### **4.2.1.1 Evoked elements tend to be coded by** *toiuno-wa*

*Toiuno-wa* typically codes evoked elements. As exemplified in (9) and (10), the antecedents of *toiuno-wa*-coded elements, *un* 'luck' in (9) and *tiryoo-hoo* 'treatment methods' in (10), are mentioned in the immediately preceding contexts.


Non-anaphoric elements coded by *toiuno-wa* are considered to be evoked through the implicit introduction of an element or by the physical context. In (11), *supootu-kansen* 'sport watching' is non-anaphoric, but the speaker has mentioned that he watched a world title match. Thus, 'sport watching' is considered to be already evoked when the speaker mentions it with *toiuno-wa*-coding in line c.

(11) a. ee sekai-taitoru-sen-o-desu-ne ee terebi-de mi-masi-ta
        fl world-title-fight-*o*-plt-fp fl TV-by watch-plt-past
        '(My friend and I) watched a world title match on TV.'
     b. ...
     c. watasi-zisin gu -wa ee amari koo **supootu-kansen-teiunowa** tyotto si-nakat-ta-n-desu-ne
        1sg-self frg -*wa* fl not.really fl sport-watching-*toiuno-wa* fl do-neg-past-nmlz-plt-fp
        'I myself hadn't watched any kinds of sports.' (S01M0182: 52.77-79.62)

Similarly, in (12), *taitoru* 'title (in piano competitions)' is a non-anaphoric element, but the speaker was talking about 'awards' in the preceding context and 'title' can be considered to have been evoked at the time of utterance (12-e).

	- b. So far the best award I received was the fourth best place in the China-Japan International Competition.
	- c. Beyond that, I would like to receive higher awards.
	- d. ano fl doositemo anyhow kore-wa this-*wa* yappari anyway piano-o piano-*o* kokorozasu orient mono-ni people-for totte-wa in.terms.of-*wa* 'This, for those who want to make a name as a pianist,'
	- e. kono this **taitoru-tteiuno-wa** title-*toiuno-wa* sugoku very ookii-node big-because 'titles matter a lot, so...' (S00F0209: 507.13-529.76)

In other cases, as in (13), *toiuno-wa*-coded elements are considered to be evoked through "common sense". (13) is the beginning of the talk but the speaker mentions *ningen* 'human being' with *toiuno-wa*-coding. This is because people can always talk about human beings even in out-of-the-blue contexts. Therefore, "human beings" are always available as topics. *Tuuno-wa* is a variation of *toiuno-wa*.

(13) **ningen-tuuno-wa** human-*toiuno-wa* hizyooni very ano fl umaku well deki-teru created-pfv doobutu-da-to animal-cop-quot omoi-masu-ne think-plt-fp 'I think that human beings are well-created.' (S02M1698: 6.99-11.00)

Readers might think that (13) is acceptable because 'human being' is generic rather than evoked in the physical context. However, I do not employ this account for the following two reasons: (i) being generic is a characteristic shared by all *toiuno-wa*-coded elements (see §4.2.1.3), and (ii) even when they are generic, some elements are difficult to code with *toiuno-wa* at the beginning of discourse. Let us discuss example (14), which is at the very beginning of a speech about travel to Hawaii.

(14) teema-wa theme-*wa* hawai-too-no Hawaii-island-gen sizen-no nature-gen subarasisa-to splendor-and tabi-no travel-gen tanosisa-nituite-desu fun-about-cop 'The topic (of this talk) is the splendor of Hawaii's nature and the fun of traveling.' (S00F0014: 0.30-6.08)

In this example, the speaker did not choose to code 'the splendor of Hawaii's nature and the fun of traveling' with *toiuno-wa*. It is harder to code this with *toiuno-wa* than 'human being' because, even though it is generic, it is not always available as a topic. Therefore, I argue that the *toiuno-wa*-coded 'human being' in (13) is acceptable without prior introduction of human beings because it is always available as a topic, not because it is generic.

### **4.2.1.2 Declining or inferable elements tend not to be coded by** *toiuno-wa*

There are a few examples where *toiuno-wa* codes inferable elements. In (15), the speaker explains why she came to Iran and describes the middle school there. The climate in Iran has not been mentioned before (15-c), but is still coded by *toiuno-wa*. The climate in Iran is neither implicitly introduced nor available as a universal topic.

	- c. eeto fl iran-no Iran-gen **kikoo-tteiuno-wa** climate-*toiuno-wa* tomokaku at.any.rate kansoo dry si-tei-masi-te do-prog-plt-and 'Uh, the climate in Iran was very dry...' (S03F0072: 178.31-181.65)

Similarly, in (16-c), the speaker is going to talk about a dog his family kept. The speaker begins with the explanation why the dog came to his house. The element *keei* 'background (of why the dog came)' is coded by *toiuno-wa*, although *keei* has not been explicitly mentioned in the preceding context.


	- b. (After the death of their previous dog, the dog he is going to talk about joined his family.)
	- c. e fl uti-ni home-to ki-ta come-past **keei-toiuno-wa** background-*toiuno-wa* 'The background of how the dog came to our house is'
	- d. ma fl sono that zyuui-san-no vet-hon-gen syookai-nan-desu-keredomo introduction-nmlz-cop.plt-though '(through) the introduction of that vet...' (S02M0198: 141.97-146.92)

On the other hand, there are some cases where it is unnatural for *toiuno-wa* to code inferable elements. For example, in (17-d), the element *hikoozyoo* 'airport', which is coded by *wa* in the original utterance, cannot naturally be coded by *toiuno-wa*. The airport is inferable because the speaker has already mentioned flying to Lukla.

	- b. From that village, we started trekking.
	- c. sono that rukura-no Lukla-gen mura-nan-desu-ga village-nmlz-plt-though 'Regarding that Lukla village,'
	- d. **hikoozyoo-{wa/(??-toiuno-wa)}** airport-{*wa*/(-*toiuno-wa*)} hontooni really yama-no mountain-gen naka-ni inside-in ari-masi-te exist-plt-and 'the airport is really in a mountainous area.' (S01F0151: 179.50-191.39)

I speculate that the differing acceptability of *toiuno-wa* in (15), (16), and (17) is due to the fact that the elements in question have different statuses in the given-new taxonomy or in their accessibility; 'the climate' in (15) and 'the background' in (16) are more general terms and are more easily accessible than 'the airport' in (17). Note that this does not contradict, but is rather consistent with, the Semantic Map Connectivity Hypothesis (1). Since the given-new taxonomy scale is continuous, the boundary between evoked and inferable is blurred, and among the inferable elements in these examples, 'the climate' of Iran in (15) and 'the background' in (16) are easier to access than 'the airport' in (17). This is consistent with the nature of the conceptual space, although the boundary is drawn clearly in the semantic map in Table 4.1 for the purpose of presentation.


*Toiuno-wa* is unnatural when it codes declining elements. The degree to which a referent is declining is difficult to calculate from the corpus. Apparently, it does not simply correspond to the distance between an element and its antecedent; rather, the intervention of (an)other topic(s) seems to be more relevant. For example, a copula followed by *kedo* codes declining or unused elements, as will be shown in §4.2.3. In (18-g), it codes a declining element rather than an unused element, since the element in question has already been introduced in line a. In line a, two potential topics, 'fame' and 'work', are introduced. The speaker talks about 'fame' first and moves on to 'work' in line g. It is fair to assume that another topic, 'fame', intervenes between the introduction of 'work' and its resumption. When the element 'work' is retrieved as the current topic in line g, it is coded by a copula followed by *keredomo* 'though', a variation of *kedo*. However, this marker cannot be replaced with *toiuno-wa*.

	- b. Concerning fame,
	- c. I have been participating in various piano competitions.
	- d. So far the best award I received was the fourth best place in the China-Japan International Competition.
	- e. Beyond that, I would like to receive higher awards.
	- f. Titles matter a lot for pianists, so I will work hard.
	- g. de then ato-wa remaining-*wa* **sigoto-no** job-gen **bubun-{nan-desu-keredomo/(??toiuno-wa)}** part-{nmlz-cop.plt-though/*toiuno-wa*} 'Concerning the other one, work,'
	- h. to receive higher wages...

*Toiuno-wa* cannot code elements that have not been established as topics. In (19), although 'tea time' is introduced in line b, it does not appear to be established enough to be a topic, which makes *toiuno-wa* unnatural in line d; the original marker is a copula followed by *keredomo*.

	- b. in addition, there is tea time and we can take a break while we climb the mountain,
	- c. so, we walked without feeling that we were in a big group.

These subtle differences in the acceptability of *toiuno-wa* cannot be captured simply by counting numbers. However, they are clear from the acceptability judgements.

Unused elements also cannot be coded by *toiuno-wa*. It is very difficult to find unused elements because of the nature of our corpus; each speaker gave a speech in front of people s/he does not know, and there were only a few things the speaker could assume to share with the hearer(s). However, constructed examples like (20) clearly show that *toiuno-wa* cannot code unused elements.

	- A: asita-no tomorrow-gen **paathii-{da-kedo/??toiuno-wa}** party-{cop-though/*toiuno-wa*} nan-zi-kara-na-no what-o'clock-from-cop-q 'What time does tomorrow's party start?'

Note that if the element 'party' has already been introduced into the discourse, *toiuno-wa* can code it. This is shown in (21-A).<sup>7</sup>

(21) Context: A and B are having a conversation. B mentioned a party taking place on the following day, and A knows that both A and B are going to go.

<sup>7</sup> In this example, I am using *tteiuno-wa* instead of *toiuno-wa* simply because this hypothetical utterance is casual; *tteiuno-wa* is more casual than *toiuno-wa*. *Toiuno-wa* sounds too formal in this utterance.


A: sono that **paathii-{??da-kedo/tteiuno-wa}** party-{cop-though/*toiuno-wa*} nan-zi-kara-na-no what-o'clock-from-cop-q 'What time does that party start?'

### **4.2.1.3 Further characteristics of** *toiuno-wa***-coded elements**

Statements about *toiuno-wa*-coded elements tend to represent the general characteristics of the referents, as has been pointed out in Masuoka (1987; 2008a). Masuoka argues that *toiuno-wa*-coded elements only accompany individual-level predicates (property predicates in his terminology). This is clearly shown in the contrast between (22-a) and (22-b) (repeated from (55) in §2.4.2.5). Whereas the stage-level predication in (22-a) does not allow *toiuno-wa*, the individual-level predication in (22-b) does allow *toiuno-wa*.


In our corpus, most examples of *toiuno-wa* also accompany individual-level predication rather than stage-level predication. In (23), the speaker is talking about the general characteristics of puppies.

(23) **koinu-toiuno-wa** puppy-*toiuno-wa* dono which syurui-demo kind-also hizyooni very ano fl neru-no-ga sleep-nmlz-*ga* tokui-desu-ne good.at-cop.plt-fp 'Puppies are, no matter what kind, good at sleeping.' (S02M1698: 166.62-170.59)

The explanation for this requires further investigation.

### **4.2.2** *Wa*

*Wa* codes inferable elements in addition to evoked elements. Overall, the referents of *wa*-coded elements are assumed to be borne in the hearer's mind at the time of utterance; alternatively, they can easily be accommodated to this assumption.

### **4.2.2.1 Evoked and inferable elements tend to be coded by** *wa*

As the following examples show, *wa* can code evoked elements. In (24), 'chelow kebab' is mentioned in line a, and it is mentioned again in lines b and g. The second and third mentions of this element are coded by *wa*.

(24) a. There is a dish called chelow kebab.
	- b. de and **sore-wa** that-*wa* eeto fl gohan-ni rice-to eeto fl bataa-o butter-*o* maze-te mix-and 'That, you mix rice with butter...'
	- c. on top of that you put spice,
	- d. on top of that you put mutton,
	- e. you mix it and eat it.
	- f. There were many dishes of this kind.
	- g. **sore-wa** that-*wa* kekkoo to.some.extent sonnani not.really hituzi-no sheep-gen oniku-no meat-gen kusasa-mo smell-also naku-te not.exist-and 'It did not have the smell of mutton...'
	- h. I thought it was delicious. (S03F0072: 446.03-471.72)

Also in (25), 'the result of the medical exam' is mentioned in line b, and it is mentioned again in line c coded by *wa*.



Unlike *toiuno-wa*, *wa* also extensively codes inferable elements. In (26), line a, *nyuusya* 'admission to a company' triggers *siken* 'exam' in line c, which is naturally coded by *wa*.

(26) a. ee fl toaru certain ryokoo-sya-ni travel-company-dat ano fl itioo tentatively nyuusya admission kimari-masi-ta decide-plt-past 'A certain travel company admitted me to work there.' b. ... c. hizyooni very **siken-wa** exam-*wa* muzukasikat-ta-to difficult-past-quot ima-mo now-also oboe-teori-masu remember-prog-plt '(I) still remember that the exam was very hard.' (S01F0038: 231.34-241.96)

*Wa* sometimes forces the hearer to accept the assumption that s/he has already been thinking about the *wa*-coded referent, a phenomenon which I call accommodation. In (27), which is the continuation of the conversation in (26), the *wa* that codes *gyappu* 'gap' in line c forces the hearer to accept the assumption that s/he expected the speaker to talk about the gap between expectation and reality.

(27) a. tada but soko-kara that-from saki-wa ahead-*wa* ano fl dono which sigoto-mo job-also soo-da-to so-cop-quot omou-n-desu-ga think-nmlz-plt 'But, after the admission, I guess this is the same in all kinds of jobs,' b. yume-to dream-and genzitu-tte reality-quot iu-n-desu-ka call-nmlz-plt-q 'people might call it (the difference between) dream and reality,' c. **gyappu-wa** gap-*wa* kanari very ari-masi-te exist-plt-and

'there was a gap (between what I expected and reality).' (S01F0038: 265.11-270.98)

In cases like (26) and (27), some hypothetical speakers might have chosen to use *ga* instead of *wa*, while *wa* cannot be replaced by *ga* to code evoked elements in (24) and (25). If the elements in (26) and (27) were coded by *ga*, they would not force the hearer to accommodate the assumption that s/he has already been thinking about them.

What is inferable and what is not depends on the culture. In Japanese culture, apartments might come with household appliances such as a washing machine, but not with livestock. Therefore, in (28-b), coding *sentaku-ki* 'washing machine' with *wa* sounds natural, while in (28-b′), coding *hituzi* 'sheep' with *wa* sounds odd, as if the speaker assumed that it is common for a room to come with a sheep, an assumption that is too difficult to accommodate.

	- b. **sentaku-ki-{wa/ga}** washing-machine-{*wa/ga*} tui-te-ta-yo come.with-prog-past-fp '(The room) comes with a washing machine.'
	- b′. **hituzi-{??wa/ga}** sheep-{*wa/ga*} tui-te-ta-yo come.with-prog-past-fp '(The room) comes with a sheep.'

Note that *ga*-coding is acceptable in both cases because *ga* can code new elements.

Kuroda (1972) and Kuno (1973b) argue that generic NPs are always available as topics and can always be coded by *wa*. However, as I have discussed in §4.2.1, not all generic NPs are available as topics. Kuno's examples like (29) may be natural at the beginning of speech.

(29) kuzira-**wa** whale-top honyuu-doobutu-desu mammal-animal-cop.plt 'Speaking of whales, they are mammals. (A whale is a mammal.)' (Kuno 1973b: 44)

People can expect the speaker to start talking about *kuzira* 'whales' out of the blue. However, it is difficult to expect the speaker to talk about the "Kosovo War" (S00M0199) or about "Himalaya trekking" (S01F0151). Therefore, these NPs are not naturally coded by *wa* out of the blue even when they are in generic statements, since they are not available as topics and are difficult to accommodate. The speakers would choose other forms to introduce these NPs, to then explain them in more detail in generic statements. Out of 12 speeches I studied, there is only one speech (S02M1698) where the speaker begins with a generic statement with *toiuno-wa*, which is (13) above. The speaker begins with a generic statement about human beings in general, and the hearer(s) can easily expect the speaker to start talking about this out of the blue.


### **4.2.2.2 So-called contrastive** *wa*

I argue that the so-called contrastive *wa*, which has been discussed extensively in the literature (e.g., Kuno 1973b), is a special case of *wa* coding inferable elements. In typical cases of inferables like (26), the referent of one element (e.g., *nyuusya* 'admission to a company') is explicitly mentioned and the referent of another related element (e.g., *siken* 'exam') is partially evoked, triggered by the element that has been mentioned explicitly; 'the admission' and 'the exam' form a set relevant to the current discourse. Similarly, the elements coded by contrastive *wa* are assumed to belong to a set relevant to the current discourse. In (30), which is slightly modified from (28), *reezooko* 'fridge' and *sentaku-ki* 'washing machine' belong to the same category of 'things expected to come with a room'. The 'fridge' and the 'washing machine' are contrasted in the sense that one is being furnished while the other is not.

(30) a. I'm looking for a new room and yesterday I saw one room.

b. **reezooko-wa** fridge-*wa* tui-te-nakat-ta-kedo come.with-prog-neg-past-though **sentaku-ki-wa** washing-machine-*wa* tui-te-ta-yo come.with-prog-past-fp 'Though (the room) doesn't come with a fridge, (it) comes with a washing machine.'

Note that *wa* coding *hituzi* 'sheep' is still not natural in (31) for the same reason as (28); a sheep is not expected as a normal thing in an apartment.

(31) a. I'm looking for a new room and yesterday I saw one room. b. ??**reezooko-wa** fridge-*wa* tui-te-nakat-ta-kedo come.with-prog-neg-past-though **hituzi-wa** sheep-*wa* tui-te-ta-yo come.with-prog-past-fp 'Though (the room) doesn't come with a fridge, (it) comes with a sheep.'

Similarly, in (32) from our corpus, the *wa*-coded elements *tinomigo* 'infants' and *inu* 'dogs' are contrasted. They belong to the relevant category of 'creatures that might not be allowed to enter restaurants'.

(32) a. de and doitu-toiu Germany-quot kuni-wa nation-*wa* hizyooni very ano fl uu fl inu-ni dog-dat e fl sumi-yasui live-easy kuni-desu nation-cop.plt 'Germany is a dog-friendly country.'

b. tatoeba for.example aa fl resutoran-de-mo restaurant-at-also anoo fl **tinomigo-wa** infant-*wa* haire-nai-yoona enter.can-neg-such.as resutoran-mo restaurant-also **inu-wa** dog-*wa* haireru-to enter.can-quot 'For example, restaurants where infants are not allowed to get in, uh, dogs can get in.' (S02M1698: 243.46-256.10)

Kuno (1973b: p. 44 ff.) points out that contrastively *wa*-coded elements are not necessarily anaphoric (given), while non-contrastively *wa*-coded elements are. However, there is a problem with this claim. It is possible for non-contrastively *wa*-coded elements to be non-anaphoric – they can be inferable, as we have seen in the previous section. If what Kuno means by "anaphoric" includes bridging anaphora (Clark 1975) and thus includes inferable elements, then contrastively *wa*-coded elements are also anaphoric, because the elements belong to the same category relevant to the current discourse. I argue that the distinction between contrastive and non-contrastive is continuous and a matter of degree; if there are more than two evoked referents in the same category, they tend to be contrastive, while if there is only one element, it is non-contrastive.

### **4.2.2.3 Declining and unused elements tend not to be coded by** *wa*

Declining elements cannot be coded by *wa*. For example, in (18), which is repeated here as (33) for convenience, another topic, 'fame', intervenes before the speaker returns to 'work'. When the speaker goes back to 'work', it is not natural for *wa* to code this element.

	- b. Concerning fame,
	- c. I have been participating in various piano competitions.
	- d. So far the best award I received was the fourth best place in the China-Japan International Competition.
	- e. Beyond that, I would like to receive higher awards.
	- f. Titles matter a lot for pianists, so I will work hard.
	- g. de then ato-wa remaining-*wa* **sigoto-no** job-gen **bubun-{nan-desu-keredomo/(??-wa)}** part-{nmlz-cop.plt-though/*-wa*} 'Concerning the other one, work,'
	- h. to receive higher wages... (S00F0209: 495.77-539.19)

Similarly, unused elements cannot be coded by *wa*, as the contrast between (34) and (35) shows. The contexts for these examples are repeated from (20) and (21).

(34) Context: According to Facebook, both A and B are going to a party tomorrow. But they have not seen each other for a week. A sees B in a classroom and talks to him/her:

A: asita-no tomorrow-gen **paathii-{da-kedo/??-wa}** party-{cop-though/*wa*} roku-zi-kara-da-yo-ne six-o'clock-from-cop-fp-fp 'Tomorrow's party is from six, right?'

	- A: asita-no tomorrow-gen **paathii-{??da-kedo/-wa}** party-{cop-though/*wa*} roku-zi-kara-da-yo-ne six-o'clock-from-cop-fp-fp 'Tomorrow's party is from six, right?'

Although many scholars discuss *wa* based on examples like (36), which appear to be produced out of the blue, such examples are unnatural in spoken Japanese.

(36) ??anoo fl **toire-wa** bathroom-*wa* doko-desu-ka where-cop.plt-q 'Excuse me, where is the bathroom?'

Assuming that (36) is produced out of the blue without previous mention of the bathroom, the best marker is *Ø*. In written Japanese, by contrast, *wa* seems able to code unused elements, as in (37), which is to be read as written text (e.g., from an e-mail or a letter).

(37) tokorode by.the.way kono this aida interval ohanasi speech si-tei-ta do-prog-past **eega-wa** movie-*wa* totemo very omosirokat-ta-desu interesting-past-plt 'By the way, the movie I mentioned the other day was very interesting.'

The spoken Japanese version of (37) is not natural, as shown in (38).

(38) ?a oh kono this aida interval hanasi-te-ta talk-prog-past **eega-wa** movie-*wa* totemo very omosirokat-ta-desu-yo interesting-past-plt-fp 'By the way, the movie I mentioned the other day was very interesting.'

Formal speech is closer to written Japanese than casual speech and the boundary between them is blurred. Note, however, that the conceptual space is a suitable format to capture variations like this (see Croft 2010).

### **4.2.3 The copula followed by** *ga* **or** *kedo*

A combination of a copula followed by *ga* or *kedo* codes declining or unused elements. As has been mentioned above, there are not many examples of these topic markers in the corpus and I will mainly employ grammatical judgements of constructed and actual examples, and will analyze them qualitatively rather than quantitatively. The results are compatible with the claims in Koide (1984) and Takahashi (1999), supporting the conclusions of this chapter. As discussed in §2.4.2.6, they argue that *ga* newly introduces topics at the beginning of a discourse.

There are variations of both copulas and *ga* or *kedo*. Copulas can be *da* or *desu*. *Desu* is more polite than *da*, and it appears more frequently in our corpus. This is a natural consequence of the nature of the corpus, in which the speakers are not familiar with their listeners. There are no remarkable variations of *ga*, while there are some variations of *kedo*: *keredomo* and *kedomo*. In the following sections, I will sometimes call this marker *kedo*. Keep in mind, however, that there are variations of *kedo* as well as of the copulas preceding it.

### **4.2.3.1 Evoked and inferable elements cannot be coded by the copula followed by** *ga* **or** *kedo*

Evoked elements cannot be coded by *kedo*. This is exemplified in (39), where the ice cream that H had kept in the fridge is assumed by speaker Y to be evoked in H's mind. It is appropriate to assume that the referent 'ice cream' is evoked in H's mind because H opens the fridge.

### 4 Particles

	- Y: **aisu-{??da-kedo/wa}** ice.cream-{cop-though/top} taroo-ga Taro-*ga* tabe-tyat-ta-yo eat-pfv-past-fp 'Taro ate up (your) ice cream.'

In a similar way, inferable elements cannot be coded by the marker, as shown in (40), where 'ice cream' is assumed to be inferable because they are talking about the things in the fridge and both of them know that there was ice cream there.

	- H: I'm sure that there are still rice cakes remaining.
	- Y: un yeah demo but **aisu-{??da-kedo/wa}** ice.cream-{cop-though/*wa*} taroo-ga Taro-*ga* tabe-tyat-ta-yo eat-pfv-past-fp 'Yeah, but Taro ate up (your) ice cream.'

### **4.2.3.2 Declining and unused elements can be coded by the copula followed by** *ga* **or** *kedo*

Declining elements can be coded by *kedo*. As discussed above, there is no simple way to identify declining elements. The declining status appears to be related to the intervention of other topics; when the speaker shifts from one topic to another and then mentions the first one again, the first topic is considered to be declining. In example (41), the speaker introduced the first (fame) and the second (work) topics at the same time in line a. She talks about the first one in lines b to f, then moves on to the second one in line g, where the second topic (work) is considered to be declining.

	- b. Concerning fame,
	- c. I have been participating in various piano competitions.
	- d. So far the best award I received was the fourth best place in the China-Japan International Competition.
	- e. Beyond that, I would like to receive higher awards.
	- f. Titles matter a lot for pianists, so I will work hard.



As discussed in §4.2.1.2, 'tea time' in example (19), repeated here as (42), is not yet established as a topic (and hence cannot be coded by *toiuno-wa*). This kind of referent can also be coded by *kedo*. *Kedo* is able to upgrade the referent to topic status.

	- b. in addition, there is tea time and we can take a break while we climb the mountain,
	- c. so, we walked without feeling that we were in a big group.
	- d. de and kono this **thii-taimu-nan-desu-keredomo** tea-time-nmlz-cop.plt-though 'And at this tea time,'
	- e. kono this hyookoo-no elevation-gen takai high tokoro-de-wa place-loc-*wa* koozanbyoo-toiu altitude.sickness-quot hizyooni very kikennna dangerous kanoosee-ga possibility-*ga* aru-node exist-because 'this place of high elevation, there is a possibility of altitude sickness, so...'
	- f. ee fl mizu-ga water-*ga* hizyooni very zyuuyooni important nari-masu become-plt 'water is very important.' (S01F0151: 323.00-349.56)

Only one element coded by *kedo*, shown in (43), is non-anaphoric; the other six examples are anaphoric. In this example, the speaker has been talking about travel to Hawaii, then she mentions 'the traveling style', which is coded by *kedo*.

	- b. anoo fl watasi-wa 1.sg-*wa* moo fl kekkoo to.some.extent ma fl tabi-nare-teru-to travel-is.used.to-quot iu-ka say-q 'I'm used to traveling to some extent, so to speak...' (S00F0014: 300.43-309.95)

This kind of example may be considered to be inferable; traveling is associated with its style. However, the association might be too weak. I categorize this example as a marginal inferable case where *kedo* functions to upgrade the referent to the topic status.

Unused elements can be coded by *kedo*, as shown in (44). In this example, it is assumed that speaker Y and hearer H share the knowledge about a particular ice cream but it is not evoked in H's mind because s/he is just in school.

	- Y: sooieba by.the.way **aisu-{da-kedo/??wa}** ice.cream-{cop-though/top} taroo-ga Taro-*ga* tabe-tyat-ta-yo eat-pfv-past-fp 'By the way, Taro ate up (your) ice cream.'

### **4.2.3.3 Further analysis of the copula followed by** *ga* **or** *kedo*

The above examples of *kedo* might be considered to be clauses rather than phrases because *ga* and *kedo* are subordinate clause markers. In (45), *kedo* (realized as *keredomo*) is a subordinate-clause marker; the clause has the subject *pointo* 'point' and the predicate *kirauea-kazan* 'Kilauea'. Thus, all the examples of topics coded by *kedo* above might also be considered predicates of copula clauses.

(45) a. sono fl hawai-too-no Hawaii-island-gen ma fl kankoo-no sightseeing-gen itiban most sono fl ookina big pointo-tteiuno-ga point-*toiuno-ga* **kirauea-kazan-nan-desu-keredomo** Kilauea-volcano-nmlz-cop.plt-though 'The biggest sightseeing point on Hawaii island is Kilauea...' b. anoo fl kirauea-kazan-mo Kilauea-volcano-also mappu-o map-*o* kai-masi-te buy-plt-and de and zibun-tati-de self-pl-by ma fl renta-kaa-o rent.a-car-*o* tobasi-te drive-and e fl iki-masi-ta go-plt-past '(We) bought a map, drove a rental car, and went to Kilauea by ourselves.' (S00F0014: 836.05-850.16)


However, there are differences between examples like (45-a) and the topics coded by *kedo* discussed in the preceding sections, as was mentioned in §2.4.2.6. First, it is actually impossible to "recover" the subject of alleged copula clauses with topic-coding *kedo*, while it is possible in general for the copula predicate followed by *kedo* to have a subject. For example, one cannot "recover" the subject of the alleged copula clause in (44), while examples like (45-a) do have a subject. Therefore, the former is considered to be a kind of phrase, whereas the latter is a kind of clause.

Second, topic elements coded by *kedo* are presupposed to be shared between the speaker and the hearer, while predicates of copula clauses followed by *kedo* like (45) are not presupposed to be shared. This is supported by the *hee* test. As shown in (46), *kedo*-coded topics cannot be repeated as news preceded by *hee* 'oh, really'.


On the other hand, the predicate of copula clauses followed by *kedo* can be repeated as news, as shown in (47).

(47) A: sono fl hawai-too-no Hawaii-island-gen ma fl kankoo-no sightseeing-gen itiban most sono fl ookina big pointo-tteiuno-ga point-*toiuno-ga* **kirauea-kazan-nan-desu-keredomo** Kilauea-volcano-nmlz-cop.plt-though 'The biggest sightseeing point on Hawaii island is Kilauea...' (S00F0014: 836.05-842.87)

B: hee oh kirauea-kazan-nan-da Kilauea-volcano-nmlz-cop 'Oh, Kilauea volcano.' (Constructed)

Although these two kinds of *kedo* are distinct, they are related to each other.


Niwa (2006: Chapter 9) argues that *ga*-coded subordinate clauses state the background of the main clause and that this use of subordinate *ga* then grammaticalized into a topic marker. However, historical investigations are necessary to support this claim and I leave it open for future studies.
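As a compact restatement of §4.2.1 to §4.2.3, the tendencies reported for the three overt topic markers can be sketched as a lookup table. This is only an illustrative sketch: the names `TYPICAL_RANGE` and `natural_markers` are hypothetical, the set-membership encoding flattens what the text describes as graded tendencies, and the marginal cases in (15), (16), and (43) are deliberately left out.

```python
# Illustrative sketch only: a simplified restatement of the tendencies in
# sections 4.2.1-4.2.3. All names here are hypothetical, and set membership
# is a simplification of what the text presents as graded acceptability.

STATUSES = ("evoked", "inferable", "declining", "unused")

# Statuses each marker is reported to code naturally.
# Marginal cases such as (15), (16), and (43) are intentionally excluded.
TYPICAL_RANGE = {
    "toiuno-wa": {"evoked"},
    "wa": {"evoked", "inferable"},
    "cop+kedo": {"declining", "unused"},
}


def natural_markers(status):
    """Return, in alphabetical order, the markers that typically code `status`."""
    if status not in STATUSES:
        raise ValueError(f"unknown activation status: {status!r}")
    return sorted(
        marker for marker, statuses in TYPICAL_RANGE.items() if status in statuses
    )
```

Under these assumptions, `natural_markers("unused")` returns only the copula-plus-*kedo* marker, in line with the contrasts in (20) and (34).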

### **4.2.4 Ø**

As mentioned earlier, zero particles do not appear frequently in our corpus for reasons of style. As a result, most examples in this section are constructed rather than naturally produced.

There are two kinds of zero particles (Ø): a topic-coding zero particle and a focus-coding zero particle. They differ in at least three respects, as summarized in (48) (see also Niwa 2006; Nakagawa & Sato 2012).


The elements coded by the topic-coding Ø are by definition assumed to be shared between the speaker and the hearer. Also, they precede other arguments and are followed by an accentual-phrase boundary. On the other hand, those coded by the focus-coding Ø are by definition assumed not to be shared between the speaker and the hearer. They appear close to the predicate and are not followed by an accentual-phrase boundary; rather, they are produced in a single intonation contour with the predicate. As shown by the contrast between (49) and (50), the element *nezumi* 'mouse' preceding another argument *neko* 'cat' is felicitous when the speaker and the hearer share the referent in question as in (49-Y), while it is not when they do not share the referent as in (50-Y). On the other hand, the element 'mouse' adjacent to the predicate *tukamae-ta* 'caught' is felicitous when they do not share the referent as in (50-Y′), while it is not when they share the referent as in (49-Y′).

	- Y: **nezumi-Ø**, mouse-Ø neko-ga cat-*ga* tukamae-ta-yo catch-past-fp 'The cat caught (the) mouse.'


	- H: Anything fun today?
	- Y: ??**nezumi-Ø**, mouse-Ø neko-ga cat-*ga* tukamae-ta-yo catch-past-fp Intended: 'The cat caught a mouse.'
	- Y′: neko-ga cat-*ga* **nezumi-Ø** mouse-Ø tukamae-ta-yo catch-past-fp 'The cat caught a mouse.'
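The three cues that separate the two zero particles (shared status, position, and the accentual-phrase boundary) can be restated as a small decision sketch. It is not part of the original analysis: the class `ZeroCodedElement`, the function `classify_zero`, and the label `"mismatch"` are hypothetical, and treating the cues as all-or-nothing booleans is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZeroCodedElement:
    """Hypothetical record of the three cues described for zero-coded elements."""

    shared: bool                    # referent assumed shared by speaker and hearer
    precedes_other_argument: bool   # preposed, rather than adjacent to the predicate
    followed_by_ap_boundary: bool   # accentual-phrase boundary after the element


def classify_zero(element: ZeroCodedElement) -> str:
    """Classify a zero-coded element under the simplifying all-or-nothing assumption.

    Returns "topic" when all three cues are present, "focus" when none is,
    and "mismatch" otherwise (the configuration the text reports as infelicitous).
    """
    cues = (
        element.shared,
        element.precedes_other_argument,
        element.followed_by_ap_boundary,
    )
    if all(cues):
        return "topic"
    if not any(cues):
        return "focus"
    return "mismatch"
```

On the assumption that the preposed *nezumi* in (50-Y) is followed by a boundary, an unshared referent in that position comes out as `"mismatch"`, while the predicate-adjacent *nezumi* in (50-Y′) comes out as `"focus"`.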

Similarly, Niwa (2006: Chapter 10) reports that topical elements such as *ano ko* 'that girl' and *ree-no seerusuman* 'the salesman' are felicitously zero-coded clause-initially, as the contrasts between (51-a–b) and (52-a–b) show.

	- a. oi hey keiri-ka-ni accounting-section-dat **ano** that **ko**-{**ga/Ø**} girl-{*ga/Ø*} hait-ta-zo enter-past-fp 'Hey, that girl joined the accounting section.'
	- b. oi hey **ano** that **ko**-{**ga/Ø**} girl-{*ga/Ø*} keiri-ka-ni accounting-section-dat hait-ta-zo enter-past-fp 'Hey, that girl joined the accounting section.' (Niwa 2006: 293-294)

On the other hand, focal elements such as *kawaii ko* 'a cute girl' and *dokokano seerusuman* 'a salesman' are not felicitously zero-coded clause-initially, as the contrasts between (53-a–b) and (54-a–b) show.



Note that *wa* is unnatural in all of the examples (51) through (54), although I interpret these elements as topics. As I have discussed in §4.2.2, *wa* codes elements referring to evoked or inferable entities. *Ano ko* 'that girl' in (51) and *ree-no seerusuman* 'the salesman' in (52) are unused. Hence, *wa*-coding is unnatural in these cases, while *ga*-coding is natural. The question that naturally arises is whether these elements are actually topics. I argue that unused elements are ambiguous between topic and focus. They are topics in the sense that the referent in question is shared between the speaker and the hearer via shared knowledge or common sense, and they are foci in the sense that the referent is newly introduced into the discourse.

Throughout this section, I mainly discuss P (the patient-like argument in transitive clauses) preceding A (the agent-like argument in transitive clauses), because P in this position is clearly preposed, and preposed Ps tend to be topics, as we will see in Chapter 5.

### **4.2.4.1 Evoked, inferable, declining, and unused elements can be coded by Ø**

Evoked elements can be coded by Ø, as exemplified in (55). In this example, 'mouse' is assumed to be evoked in H's mind since he is looking at the mouse trap. In this case, *wa*-coding is also natural.

	- Y: **nezumi-{Ø/wa}**, mouse-{Ø/*wa*} neko-ga cat-*ga* tukamae-ta-yo catch-past-fp 'The cat caught (the) mouse.' (Evoked topic P)

This judgement might be too subtle for some readers. Here I am assuming that H is thinking about the mouse because s/he is checking the trap right now. Given this assumption, Y can felicitously use *wa* as well as zero-coding.

Inferable elements can also be coded by Ø, as shown in (56). Here *hyoosi* '(book) cover' is used instead of *nezumi* 'mouse'; a cover is easily associated with a book and is therefore assumed to be inferable from the book mentioned earlier. Again, *wa*-coding is also natural in this case.

	- Y1: Thank you for the book. It was interesting.
	- Y2: **hyoosi-{Ø /wa}** cover-{Ø /*wa*} neko-ga cat-*ga* yabui-tyat-ta break-pfv-past gomen sorry 'The cat broke the cover. Sorry.' (Inferable topic P)

Declining elements can be coded by Ø , as shown in (57), where 'mouse' is assumed to be declining. The mouse belongs to the speaker and is mentioned first in (57–Y2). Then the speaker mentions the cat in (57–Y3-4), and then mentions the mouse again in (57–Y5), which is assumed to be declining.

	- Y2: The mouse ran really quickly.
	- Y3: But the cat was also running very fast.
	- Y4: The cat seemed to be hungry.
	- Y5: de and kekkyoku eventually uti-no our-gen **nezumi-{Ø/wa/??da-kedo}** mouse-{Ø/*wa*/cop-though} neko-ga cat-*ga* tukamae-tyat-ta-yo catch-pfv-past-fp 'Finally the cat caught our mouse.' (Declining topic P)

In this example, a passive variant of the sentence is preferable to an active one like (57–Y5), since the mouse belongs to the speaker but the cat does not. I will discuss this issue further in association with subjecthood in §4.4. Moreover, *wa* is acceptable and *kedo* is not acceptable in (57–Y5), contrary to the generalization in Table 4.1. I suspect that this is because the referent 'mouse' is the center of the speaker's interest; the mouse is still evoked, which causes *wa*, rather than *da-kedo*, to be natural.

Unused elements can be coded by Ø , as exemplified in (58), where the referent 'mouse' is assumed to be unused because there is no clear evidence that H is thinking about the mouse at the time of utterance, though Y and H share the mouse that bothers them.

	- Y: **nezumi-{Ø/??wa/da-kedo}**, mouse-{Ø/*wa*/cop-though} neko-ga cat-*ga* tukamae-ta-yo catch-past-fp 'The cat caught (the) mouse.' (Unused topic P)

### **4.2.4.2 Difference between Ø and explicit forms**

In addition to stylistic differences, there are further divergences between Ø and explicit forms such as *toiuno-wa*, *wa*, and *kedo*. First, the functional category of the topic element within a clause is less clear when the topic is coded by explicit markers, while the category needs to be clear if the topic is zero-coded. For example, in (59), where *thii-taimu* 'tea time' is originally coded by *kedo*, 'tea time' and the following clause are only vaguely connected and the status of the topic element in terms of grammatical function (such as subject or object) within the clauses is not clear. In this case, coding elements with Ø is difficult.

(59) a. de and kono this **thii-taimu-{nan-desu-keredomo/(??Ø )}** tea-time-nmlz-cop.plt-though 'And at this tea time,'



Another difference between zero-coded elements and explicitly coded elements is whether backchannel responses such as *un* 'yeah' are possible right after the production of the topic element in question. For example, in (58), repeated here as (60), it is difficult to insert a backchannel response such as *un* 'yeah' after *nezumi-Ø* , but it is possible after *nezumi-da-kedo*.

	- Y: **nezumi-{Ø/da-kedo}**, mouse-{Ø/cop-though} neko-ga cat-*ga* tukamae-ta-yo catch-past-fp 'The cat caught (the) mouse.' (=(58))

This suggests that the speaker assesses the hearer's state of knowledge through *kedo*, i.e., whether the hearer can recall the referent of the *kedo*-coded element that is supposed to be shared between the speaker and the hearer, while this assessment effect is weaker in zero-coding.

### **4.2.5 Summary of topic markers**

The findings regarding topic coding were summarized in Table 4.1, repeated here as Table 4.7 for convenience. The results indicate that topics are heterogeneous, but at the same time, can be accounted for in terms of the given-new taxonomy. Closer analyses also revealed that the given-new taxonomy is continuous and that there are borderline cases.

The characteristics of *toiuno-wa* discussed in §4.2.1 are a combination of the descriptions of Masuoka & Takubo (1992) and Takubo (1989). The statements that include *toiuno-wa*-coded elements describe the general characteristics of the referents. Although it is not always the case that the speaker assumes that the hearer does not know the referent in question, the speaker might assume that s/he knows more about it than the hearer. For example, in (61), *hawai* 'Hawaii' is coded by *toiuno-wa*, where I do not believe that the speaker assumes that the hearer(s) do(es) not know Hawaii, since the islands are too famous. However, the speaker might assume that she knows more about Hawaii than the hearer(s).

Table 4.7: Topic marker vs. activation status and the given-new taxonomy

(61) **hawai-toiuno-wa** Hawaii-*toiuno-wa* ma fl nihon-zin-ga Japan-person-*ga* totemo very suki-de like-and 'Hawaii, Japanese people love it.' (S00F0014: 1145.00-1147.55)

In addition to the characteristics pointed out by the previous literature, this study found that *toiuno-wa*-coded elements tend to be evoked at the time of utterance and tend to be mentioned repeatedly in the following discourse: *toiuno-wa* codes important topics.

The discussion in §4.2.2 showed that *wa* codes elements referring to entities which are evoked or inferable through related elements. This is not only compatible with, but also elaborates on, the observation that *wa* codes elements that have been "entered into the registry of the present discourse" (Kuno 1973b: 45). I provided a cognitive model which captures the distribution of *wa*-coding and showed the range of *wa*-coding, i.e., what can and cannot be coded by *wa*. This chapter also provided a unified account of *wa*-coding in general, i.e., *wa*-coding including generic and contrastive *wa*. Of course, further empirical investigations are necessary to test whether the observations proposed here are supported.

The discussion in §4.2.3 supports previous observations on this topic expression: it is used to newly introduce topics at the beginning of a discourse or of a paragraph (Koide 1984; Takahashi 1999). I re-examined this observation in terms of the given-new taxonomy.

The discussion in §4.2.4 distinguished topic vs. focus zero particles, following Niwa (2006) and Nakagawa & Sato (2012). This section investigated the topic zero particles and made it clear that they can code elements referring to all entities in the given-new taxonomy if the entities are shared between the speaker and the hearer.
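The generalizations summarized in this section can be restated as a small lookup. The following Python sketch is only an illustration: the status labels follow the text, but the dictionary and function names are hypothetical, the reported tendencies are simplified to yes/no membership, and borderline cases such as (57), where *wa* is acceptable with a declining element, are deliberately ignored.

```python
# Illustrative only: activation statuses that each topic marker is described
# as coding in Sections 4.2.1-4.2.4. Tendencies are simplified to membership.
TOPIC_MARKERS = {
    "toiuno-wa": {"evoked"},
    "wa": {"evoked", "inferable"},
    "kedo": {"declining", "unused"},  # copula followed by kedo or ga
    "zero": {"evoked", "inferable", "declining", "unused"},
}


def topic_markers_for(status, shared=True):
    """Return the markers that can code a topic with the given status.

    Assumption: a referent not shared between speaker and hearer is not
    treated as a topic at all, so no topic marker is returned for it.
    """
    if not shared:
        return []
    return sorted(m for m, statuses in TOPIC_MARKERS.items() if status in statuses)


if __name__ == "__main__":
    for status in ("evoked", "inferable", "declining", "unused"):
        print(status, topic_markers_for(status))
```

The lookup makes the overlap visible: zero-coding is available for every shared status, whereas the explicit markers divide the taxonomy between them.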

### **4.3 Case markers**

While topic markers code topics with different statuses in the given-new taxonomy, as discussed in the previous section, in this section I will argue that elements coded by the case markers *ga* and *o* are foci. For example, in (62), the *ga*-coded element *doobutu-aigo-kyookai* 'animal shelters' and the *o*-coded elements *kihu* 'donation' and *koto* 'thing' can be repeated as news after *hee*.


Oh, {animal shelters/donation/such a thing}

It has been pointed out by many scholars that elements coded by case markers in Japanese are foci. Lambrecht (1994), for instance, argues that *ga* is appropriate for focal elements and not appropriate for topical elements. For example, compare (63) and (64). In (63), where the speaker's neck is presupposed to be at issue at the time of utterance in (63-A), only *wa*-coding is natural, although a zero pronoun is more natural in this context.

(63) Q: How's your neck? A: [kubi-{??ga/**wa**}] neck-{*ga*/*wa*} itai] hurt 'My neck HURTS.' (Lambrecht 1994: 137)


In (64), on the other hand, where the speaker's neck is not presupposed to be at issue at the time of utterance in (64-A), *ga*-coding is more natural than *wa*-coding.

(64) Q: What's the matter? A: [kubi-{**ga**/??wa} neck-{*ga*/*wa*} [itai] hurt 'My NECK HURTS.' (ibid.)

In the following sections, I will discuss focus coding mainly by means of case particles including zero (Ø). The distribution of particles is summarized in Table 4.8 (repeated from Table 4.2), where A indicates the agent-like argument of a transitive clause, S indicates the only argument of an intransitive clause, and P indicates the patient-like argument of a transitive clause (Comrie 1978; Dixon 1979). Since zero-coding typically appears only in casual speech, the main source for the generalizations in Table 4.8 is grammaticality judgements.

Note that Table 4.8 is also a semantic map: it combines a scale of agentivity on the one hand with one of contrastiveness on the other. Here I categorize argument focus together with contrastive focus under the label of "contrastive focus" because, as far as *ga*/*o* vs. zero-coding is concerned, argument and contrastive focus do not differ from each other; *ga*/*o* overtly code argument and contrastive focus (of P and patient S), whereas zero-coding is preferred elsewhere.

I argue that the Semantic Map Connectivity Hypothesis in (1) applies to this table: the category coded by each marker should map onto a connected region in conceptual space. In the following sections, I will discuss each case particle.
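What "a connected region" means can be made concrete with a small graph check. The Python sketch below is an illustration under stated assumptions, not the analysis itself: the layout of the conceptual space as a 4 × 2 grid (roles ordered by agentivity, crossed with contrastiveness), the adjacency of neighbouring cells, and all names are hypothetical, and the regions are a simplified reading of the judgements reported in the following subsections.

```python
from collections import deque

# Assumed layout of the conceptual space: roles ordered by agentivity,
# crossed with contrastiveness. Horizontally or vertically neighbouring
# cells are treated as adjacent.
ROLES = ["A", "agent S", "patient S", "P"]
FOCUS = ["non-contrastive", "contrastive"]


def neighbours(cell):
    """Yield the grid cells adjacent to `cell`."""
    role, focus = cell
    i, j = ROLES.index(role), FOCUS.index(focus)
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < len(ROLES) and 0 <= nj < len(FOCUS):
            yield (ROLES[ni], FOCUS[nj])


def is_connected(region):
    """Breadth-first search restricted to the cells in `region`."""
    region = set(region)
    if not region:
        return True
    start = next(iter(region))
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in neighbours(queue.popleft()):
            if nxt in region and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen == region


# Simplified regions: cells in which each marker is judged natural.
GA = {(r, f) for r in ("A", "agent S", "patient S") for f in FOCUS}
O = {("P", "contrastive")}
ZERO = {("patient S", "non-contrastive"), ("P", "non-contrastive")}

if __name__ == "__main__":
    for name, region in (("ga", GA), ("o", O), ("zero", ZERO)):
        print(name, is_connected(region))
```

Under these assumptions each marker occupies a connected region, whereas a hypothetical marker covering only contrastive A and contrastive P would not.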


Table 4.8: Overt vs. zero case markers

As mentioned earlier, there are few zero particles in the corpus because of the style of the corpus, and the majority of discussions in this section also rely on grammaticality judgements rather than corpus studies or other experimental methods.<sup>8</sup>

<sup>8</sup>This section is based on a part of the discussion in Nakagawa (2013).


### **4.3.1** *Ga*

This section considers the marker *ga*. I distinguish *ga* coding A and S, and *ga* in the argument- and sentence-focus environment.

### **4.3.1.1** *Ga* **coding focus A**

Focused As require *ga* regardless of whether the element in question is contrastive or not. As exemplified in (65), only *ga*-coding is natural with non-contrastive focus A; *o*- and zero-coding are not.

(65) a oh **neko-{ga/\*o/??Ø}** cat-{*ga*/*o*/Ø} nezumi mouse oikake-teru chase-prog 'Look! A cat is chasing a mouse.' (Non-contrastive focus A)

The unnaturalness of the zero-coding in (65) is not necessarily because A is not adjacent to the predicate. As shown in (66), where A is adjacent to the predicate, zero-coding is still not natural, whereas *ga*-coding is.

(66) Q: Do you know where my mouse is? A: **neko-{ga/\*o/??Ø}** cat-{*ga*/*o*/Ø} oikake-te-ta-yo chase-prog-past-fp 'The cat was chasing it.' (Non-contrastive focus A)

Contrastive focus (or argument focus) A is only naturally coded by *ga*; other markers are not natural. This is exemplified in (67), where only *neko* 'cat' rather than the whole clause is the domain of focus.


### **4.3.1.2** *Ga* **coding focus S**

Agent S is obligatorily coded by *ga*, while patient S can be coded by either *ga* or Ø when S is a non-contrastive focus, as already pointed out by Kageyama (1993: 56-57). As shown by the contrast between (68) and (69), agent S is naturally coded by *ga*, but not *o* or Ø as in (68), while patient S can be naturally coded by either *ga* or Ø , but not *o* as in (69).



Contrastive S is always coded by *ga* regardless of whether S is agent or patient.


Note that it is more natural to code non-contrastive focus animate patient S by *ga* rather than Ø , as exemplified in (72).



### **4.3.1.3** *Ga* **coding animate elements?**

Some might think that the choice between *ga* vs. Ø is sensitive to animacy rather than agentivity. As has been discussed in Chapter 1, I rather take the view that a single marker can code complex features; the marker *ga* codes focus, agent, and animate elements and one cannot determine a single feature that *ga* codes. Comrie (1979) calls this **seepage**. In Hindi, for example, the postposition *ko* codes definite or animate (especially human) direct objects, while other kinds of direct objects tend to be zero-coded. There is no simple correlation of *ko* with either animate or definite direct objects. In (73), where do stands for 'direct object marker', *ko* sometimes codes animate elements, as in (73-a), but it sometimes does not, as in (73-c), and it sometimes codes definite elements, as in (73-c) but sometimes not, as in (73-a,d). Therefore, it is difficult to decide on a single feature that *ko* codes. Rather, as Comrie (1979) argues, *ko* codes complex features comprised of animacy, definiteness, and direct object.


In the same sense that *ko* codes complex features, I argue that *ga* codes the complex features of agent, animacy, and focus. First, *ga*, but not Ø , codes inanimate A. For example, in (74), *makku* 'Mac(intosh)' in (74-a) and *baketu* 'bucket' in (74-b) are inanimate As and can only be coded by *ga*; Ø is unnatural in this context. Therefore, in addition to animacy, *ga* is also sensitive to agentivity.

(74) a. a oh **makku-{ga/?Ø}** Mac-{*ga*/Ø} koe voice dasi-ta produce-past 'Wow, a Mac produced voice!'


b. a oh **baketu-{ga/?Ø}** bucket-{*ga*/Ø} doa door osae-teru hold-prog 'Oh a bucket holds the door (and this is why the door won't close).' (Inanimate A)

### **4.3.1.4** *Ga* **coding non-nominative focus**

*Ga* also codes non-nominative focus. For example, *poteto-tippusu-to* 'with potato chips' in (75-a) and *ima-made* 'before now' in (75-b) are non-nominative, as shown in the translation; however, they are coded by *ga*.


Similarly, *guratan-ni* 'for gratin' in (76-B) is not an argument of the predicate but is still coded by *ga*.


The following examples are from a comic book and from the Internet. One can find many examples of *ga* coding non-nominative elements on the Internet. Note, however, that (77-b) in particular is not acceptable to some people.


<sup>9</sup>This nice example was suggested by Yuji Togo.

<sup>10</sup>Toriyama, Akira (1990) *Dragon Ball* 23, p. 149. Tokyo: Shueisha.

<sup>11</sup>http://tabelog.com/ehime/A3801/A380101/38006535/dtlrvwlst/2992604/, last accessed on 03/23/2015


c. ie-ni home-dat kaeru-**made**-**ga** return-lim-*ga* ensoku-desu excursion-cop.plt 'Until (you) arrive at home is the excursion. (Just before you arrive at home, you are traveling.)' (Common warning by school teachers)<sup>12</sup>

There are examples of *ga* coding non-nominative focus in actual spoken data. The following examples are from *the Chiba three-party conversation corpus* (Den & Enomoto 2007), which includes more casual conversations than CSJ. In (78), *sono hoo* 'that way' is marked by *ga* even though *okane* 'money' is the only argument of the intransitive predicate *kakaru* 'to take (time) or to cost'. The speaker compares buying a computer with other options, and claims that buying a computer costs more. Buying a computer is interpreted as focus and is coded by *ga*, while 'money' is S.

(78) **sono** that **hoo-ga** way-*ga* okane-Ø money-Ø kakaru-zyan required.intr-fp 'More money costs in THAT way (i.e., if you buy a computer).' (chiba0232: 400.32-401.43)

In (79), after listening to an angry story from another participant, the speaker claims that it was the speaker together with the other participant, rather than just the other participant, who was angry in this story. *Hara* 'belly' is the only argument of the intransitive predicate *tatu* 'stand'. *Hara tatu* 'belly stands' is an idiomatic expression meaning 'to be angry'. In this example, however, *ore-tati* 'we' is coded by *ga* because it is focused.

(79) are-wa that-*wa* musiro rather **ore-tati-ga** 1sg-pl-*ga* **hara-Ø** belly-Ø tat-ta-yo-ne stand.intr-past-fp-fp 'In that event, WE got angry (rather than you).' (chiba0432: 111.64-113.37)

These examples are the cases where *ga* purely codes focus: *ga* codes neither agent nor animate elements.

To summarize, *ga* sometimes codes animate patient S, as in (72); sometimes it codes inanimate A, as in (74); sometimes it codes non-nominative inanimate focus elements, as in (75) to (79); and, probably more frequently, it codes elements with the complex features of agentivity, animacy, and focus. Like *ko* in Hindi, *ga* codes multiple features, and it is difficult and unnecessary to determine a single feature that it codes.

<sup>12</sup>I found 32,700 websites using this expression with Google exact search (searched on 06/17/2015).

### **4.3.2** *O*

### **4.3.2.1** *O* **coding focus P**

Non-contrastive focus P is usually zero-coded, while contrastive focus P is coded naturally only by *o*. This is shown by the contrast between (80) and (81). In (80), where the question elicits a broad focus structure, zero-coding is the most natural option, while *ga*- and *o*-codings are less natural.

	- A: **tetugaku-{\*ga/?o/Ø}** philosophy-{*ga*/*o*/Ø} benkyoo study si-te-n-da-yo do-prog-nmlz-decl-fp 'I study philosophy.'

In (81), on the other hand, where the question elicits a narrow focus structure, overt *o*-coding is more natural than *ga*- and zero-codings.

(81) Q: What do you study? A: **tetugaku-{\*ga/o/??Ø}** philosophy-{*ga*/*o*/Ø} benkyoo study si-teru-n-da-yo do-prog-nmlz-decl-fp 'I study philosophy.'

Some native speakers of Japanese might not find the *o*-coding in (80) unnatural, contrary to my claim. I argue that *o*-marking of non-contrastive focus in casual conversation is limited to theatrical speech. According to Nakagawa (2013), who studied a casual spoken corpus of *manzai* (a popular stand-up comedy performed by two people), 75% (222 examples) of 297 P-codings are zero-coding, while only 25% (75 examples) are *o*-coding. Although this corpus survey does not distinguish contrastive vs. non-contrastive foci, it is clear from it that the vast majority of P-coding in casual spoken Japanese is Ø.
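The proportions cited from Nakagawa (2013) can be checked directly from the raw counts. The short Python sketch below uses only the figures given in the text; the variable names are illustrative.

```python
# Counts of P-coding in the manzai corpus as cited in the text (Nakagawa 2013).
total_p = 297
zero_coded = 222
o_coded = 75

# The two categories should exhaust the P-codings.
assert zero_coded + o_coded == total_p

zero_share = round(100 * zero_coded / total_p, 1)
o_share = round(100 * o_coded / total_p, 1)

if __name__ == "__main__":
    print(f"zero-coding: {zero_share}%  o-coding: {o_share}%")
```

The counts add up to the reported total, and the shares come out at about 74.7% and 25.3%, consistent with the rounded 75% and 25% in the text.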

### **4.3.3 Ø**

As discussed in the previous sections on *ga* and *o*, non-contrastive focus P and patient S are coded by Ø . As shown in (65), non-contrastive focus A can only be naturally coded by *ga*, while zero-coding is not natural. As discussed in relation to examples (68) and (69), non-contrastive agent S can only naturally be coded by *ga*, but not Ø, while non-contrastive patient S can be coded by either *ga* or Ø. As shown in (80), non-contrastive P can only be coded naturally by Ø.


### **4.3.4 Summary of case markers**

The distribution of case markers including zero particles was summarized in Table 4.8. This study revealed the distribution of case particles and zero particles in terms of information structure. The previous literature was not clear about the relationships between the twofold characteristics of *ga*: nominative and exhaustive listing vs. neutral description. Following Comrie (1979), this study has proposed that a single particle has multiple features at the same time. The particles *ga* and *o* are used in focus environments; at the same time, they indicate the functional relation of the element coded by these particles. In particular, *ga* even codes non-nominative focus elements, which indicates that the particle is grammaticalizing into a focus particle. In §4.5.2, I will discuss why the particle *ga*, among other particles, is starting to code focus.

### **4.4 So-called subjects**

In this section, I will briefly discuss the relationships between grammatical functions and information structure. This is associated with an issue that has long been discussed in the literature: the connection between topic and subjects (Li 1976; Du Bois et al. 2003). Since it is impossible to provide an overview of all the things that have been discussed in this longstanding debate, I briefly discuss a few points.

### **4.4.1 Subject and topic**

Whereas Aoki (1992: 2) reported that 84.7% of *wa* attaching to nouns code so-called subjects (A and S in my terms, nominative case in her terms) in novels and essays, only 40.3% of *wa* in our data code As and Ss, as shown in Table 4.9 and in Figure 4.3. The table and the figure include all the elements excluded in other analyses.<sup>13</sup> Figure 4.4, which represents the overall frequencies of elements, is shown for comparison. This graph also includes all elements excluded in other graphs. On the other hand, Table 4.9 and Figure 4.3 show that 59.0% of *toiuno-wa* code so-called subjects. This demonstrates that *toiuno-wa* in spoken Japanese is in fact closer to *wa* in written Japanese in terms of preference in the coding of grammatical functions. Although most of the literature focuses on *wa* coding subjects, these results suggest that *wa* codes other kinds of elements in spoken Japanese.

<sup>13</sup>Refer to §3.4.3.2 to see what is excluded.



Table 4.9: Topic markers vs. grammatical function

Figure 4.3: Topic markers vs. grammatical function

Figure 4.4: Overall distributions of elements

So-called subjects have a special status in discourse; they are interpreted as definite even if the NP is coded by *ga* instead of *wa*. For example, consider the difference between (82) and (83).


These utterances represent a propositional meaning that can be paraphrased as '(a/the) car ran over (a/the) dog.' Note that since Japanese does not have obvious ways to code definiteness, both 'car' and 'dog' can potentially be interpreted as either definite or indefinite, and hence 'car' and 'dog' are expressed in the same way in (82) and (83) except for the case markers. Under these conditions, the subjects 'car' in (82) and 'dog' in (83) are interpreted as definite, while the non-subjects 'car' in (83) and 'dog' in (82) are indefinite, according to the author's intuition. NPs coded by *wa* are also likely to be interpreted as definite, since the referents of those NPs are assumed to be evoked. This observation suggests that subjects without topic-marking still function as topics. This is worth investigating in the future, since my argument is no more than an impressionistic analysis.

### **4.4.2 Hierarchy of topic coding**

There seems to be a hierarchy of topic coding; given As and Ss are more likely to be coded by topic markers than given Ps. For example, consider the following example. In (84), *sohu* 'grandfather' is introduced in line a, and *pan* 'bread' is introduced in line b. In line c, which is of interest in the discussion, *oziityan* 'grandfather' is coded by *wa*, but *sore* 'that', which refers to the bread in line b, is coded by the case particle *o*.


It is unnatural for *wa* to code *sore* referring to the bread instead of *oziityan* 'grandfather', as shown in (85-c′ ). If A (e.g., *obaatyan* 'grandmother') is newly introduced, as in (85-c′′), there is no problem for *wa* coding *sore*; *obaatyan* 'grandmother' is naturally coded by *ga* instead of *wa*.


(85) c′. e fl n frg sore-{o/wa} that-{*o*/*wa*} i frg maa fl yoowa in.a.word ??**oziityan-ga** grandfather-*ga* issyookenmee trying.best taberu-n-desu-keredomo eat-nmlz-cop.plt-though 'that, my grandfather tries his best to eat it, but...'

c′′. e fl n frg sore-{o/wa} that-{*o*/*wa*} i frg maa fl yoowa in.a.word **obaatyan-{ga/??wa}** grandmother-{*ga*/*wa*} issyookenmee trying.best taberu-n-desu-keredomo eat-nmlz-cop.plt-though 'that, my grandmother tries her best to eat it, but...' (modified from (84-c))

In fact, most anaphoric Ps are still coded by *o*, instead of topic markers, whereas a higher ratio of anaphoric As and Ss are coded by topic markers. Tables 4.10 and 4.11 and Figures 4.5 and 4.6 show the distribution of topic and case markers coding A, S, and P. Table 4.10 and Figure 4.5 represent the distribution of topic and case markers coding anaphoric A, S, and P. As the table and the graph show, while 44.1% of anaphoric As and 38.8% of anaphoric Ss are coded by topic markers, only 8.4% of anaphoric Ps are coded by topic markers. On the other hand, the majority of non-anaphoric elements are coded by case markers, although non-anaphoric Ss (most of which are in fact inferable) are coded by *wa* remarkably more often than others.

I propose the hierarchy in (86) for topic coding. Given elements that are higher in this hierarchy are more likely to be coded by topic markers.

(86) A, S > P

The hierarchy indicates that so-called subjects are more likely to be coded by topic markers. This hierarchy is a topic hierarchy: the hierarchy of elements which are more likely to be topics (Givón 1976; Keenan 1976; Comrie 1979; 1983; Du Bois 1987). This hierarchy is present in many languages in various ways. For example, A and S are more likely to agree with the verb than P cross-linguistically. Also, A and S are more likely to be zero-coded than P. Japanese *wa*-coding seems to follow this hierarchy; if there are two given elements potentially coded by *wa*, A and S are preferred over P following the hierarchy in (86).
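The preference expressed by the hierarchy in (86) can be sketched as a simple ranking. The Python example below is illustrative only: the function name and the data format are hypothetical, and the hierarchy describes a statistical tendency, which the sketch simplifies to a categorical choice among elements that are all assumed to be given.

```python
# Rank in the proposed hierarchy A, S > P (lower number = higher position).
TOPIC_RANK = {"A": 0, "S": 0, "P": 1}


def preferred_wa_target(given_elements):
    """Pick the given element most likely to be coded by wa.

    given_elements: list of (form, role) pairs, all assumed to be given.
    Ties are resolved in favour of the element listed first.
    """
    if not given_elements:
        return None
    return min(given_elements, key=lambda element: TOPIC_RANK[element[1]])[0]


if __name__ == "__main__":
    # Line c of (84): both 'that' (the bread, P) and 'grandfather' (A) are given.
    print(preferred_wa_target([("sore", "P"), ("oziityan", "A")]))
```

For line c of (84), the sketch selects *oziityan* 'grandfather' rather than *sore* 'that', matching the coding observed in the example.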

### **4.4.3 Ex or detached NPs**

Table 4.10: Markers for anaphoric elements

Table 4.11: Markers for non-anaphoric elements

Figure 4.5: Markers for anaphoric elements

Figure 4.6: Markers for non-anaphoric elements

Finally, I discuss associations between "Ex" and topic markers. In §3.4.3.3, Ex-s were defined as elements "which appear to be part of the clause but do not have direct relationships with the predicate" (p. 88). A typical example is shown in (87). In (87), the predicate *nagai* 'long' is directly related to *hana* 'nose'. *Zoo* 'elephant' is not directly related to the predicate; it is not the elephant itself that is long.

(87) **zoo-wa** elephant-*wa* hana-ga nose-*ga* nagai long 'The elephant, the nose is long (The elephant has a long nose).' (Mikami 1960)

Tables 4.10 and 4.11 and Figures 4.5 and 4.6 show that Ex is only coded by topic markers. Table 4.9 and Figure 4.3 show that 21.7% of *toiuno-wa*-coded elements and 5.9% of *wa*-coded elements are categorized as Ex.

Lambrecht (1994) discusses cross-linguistic cases of Ex (in his term, "detached" topic) and argues that "in some languages at least, the detached topic NP cannot be a constituent [...] of the clause with which it is pragmatically associated" (p. 192). In (88), examples in English, the detached topics are not constituents of the clause; rather, they are in a part-whole relation with some element(s) within the clause. In (88-a), the detached topic *the typical family today* is not a constituent of the clause; instead, it is associated with *the husband and the wife* pragmatically. In the same way, the detached topics *tulips* in (88-b) and *other languages* in (88-c) are pragmatically associated with constituents of the clause – *bulbs* and *tones*, respectively.


	- b. (Talking about how to grow flowers) **Tulips**, you have to plant new *bulbs* every year?
	- c. (Lecture in an introductory linguistics course) **Other languages**, you don't just have straight *tones* like that. (Lambrecht 1994: 193)

These detached topics are strikingly similar to "Ex" in Japanese.

Lambrecht also discusses cases in which topics are not counted as constituents of the clause even though they appear to be constituents. German, for example, has a principle that only allows the verb in the second position of a clause, as exemplified in (89-a-d). However, the detached topic constituents that appear at the beginning do not count as the first constituent of the clause. As exemplified in (89-e), the verb *isst* 'eats' appears in the second position assuming that the preceding *den* 'it' is in the first position, which indicates that the detached topic *den Apfel* is not the first constituent in the clause. In fact, as in (89-f), it is unacceptable if the detached topic *den Apfel* is counted as the first constituent.<sup>14</sup>


Both the topicalized NP *den Apfel* and the resumptive pronoun *den* in (89-e) appear in the accusative. According to Lambrecht, however, accusative marking is optional for the topicalized NP, while it is obligatory for the resumptive pronoun. This is also reminiscent of topic-marking in Japanese. In Japanese, nominative and accusative codings are overridden by topic-marking, and the case of A, S, and P is not overtly expressed when they are coded by topic markers, as has been discussed in §2.4.2.4.

<sup>14</sup>*Apfel* 'apple' in e, f of (89) is considered to be "detached" because the resumptive pronoun *den* 'it.acc' is regarded as an argument of the clause and *Apfel* itself does not function as an argument.

The fact that topics tend to be "detached" from the predicate and lose case marking cross-linguistically suggests the possibility that there are some universal motivations behind this phenomenon. I argue that at least one of the motivations is clause-chaining. In clause-chaining, the speaker combines multiple clauses to form a thematic unit (Longacre 1985; Martin 1992; Givón 2001). (90) is an example of clause-chaining.

(90) She came in, [Ø] stopped, [Ø] looked around and froze. (Givón 2001: 349)

By combining clauses in this way, thematic continuity is achieved. In clause-chaining, the detached topic, which typically appears utterance-initially, as will be discussed in Chapter 5, is not necessarily an argument of the clauses; instead, it is pragmatically related to the following clauses. For example, in (91), where the speaker talks about life in Iran, *mukoo-no hito* 'people there (in Iran)' in (91-a) is detached and annotated as "Ex", since the predicate *hukaku* 'deep' has the so-called subject *hori* '(face) form', which is in a part-whole relation with 'people'. In (91-b-c), the speaker continues to talk about them by clause-chaining. *Kodomo* 'child' in (91-c) also has a part-whole relation with the people in Iran.

	- b. kiree-de beautiful-and 'beautiful,'
	- c. kodomo-nanka-wa child-hdg-*wa* anoo fl sugoku very kawaii cute kao-o face-*o* si-tei-mashi-ta do-prog-plt-past 'children had very cute faces.' (S03F0072: 375.01-386.35)

Clause-chaining is a useful way to talk about something; the speaker puts the topic at the beginning and continues to describe the topic as much as s/he can. In the descriptions found in clause-chaining, the topic is not necessarily an argument; rather, it is pragmatically associated with each clause. The hearer does not get lost. The hearer can trace the topic when the speaker provides enough evidence through linguistic expressions (such as particles, word order, and intonation) and other means (such as gestures, background knowledge, sequence of conversation, etc.).

Mikami (1960: Chapter 2) points out that *wa*-coded NPs can "go beyond periods" (p. 117) and "commas" (p. 130). This is closely related to what I argue here. He states: "in general, 'X-*wa*', skipping adverbial clauses in the middle, governs the final main clause. However, it [sometimes] governs the verbs in the middle a little bit; this is what I call [*wa*'s] going beyond commas" (p. 130). Of course, there are no commas or periods in spoken language; in my terms, *wa* and *toiuno-wa* go beyond "commas" and "periods" by governing the whole clause chain.

### **4.5 Discussion**

### **4.5.1 Distribution of markers and semantic space**

Figure 4.7: Anaphoric distance vs. expression type (all)

As discussed in §4.1, Japanese particles code elements with features that can be mapped onto a conceptual space. As reflected in Table 4.1 and discussed in §4.2, topic markers map onto a conceptual space of the given-new taxonomy, while, as shown in Table 4.2 and discussed in §4.3, case markers map onto a conceptual space of agentivity, focushood, contrastiveness, and possibly animacy.

The semantic map of topic markers in Japanese indicates that inferable and evoked statuses form a connected region and are expressed by the same marker, *wa*, while declining and unused statuses form a connected region and are expressed by the same marker, a copula followed by *kedo* or *ga*; hence, the inferable status is closer to the evoked status, and the declining status is closer to the unused status in the conceptual space. This makes sense because inferable elements are more relevant to the current topic than declining elements. For example, in (92), the inferable element *gen'in* 'cause' is coded by *wa*. The element 'cause' is inferable because the disease has already been introduced and it can be considered common knowledge that there is a cause for the disease.

	- b. First I visited several local hospitals.
	- c. I was examined several times, but
	- d. **gen'in-wa** cause-*wa* humee-de unclear-cop 'the cause (of the disease) was unclear.' (S02F0010: 74.93-82.60)

In (92), the cause of the disease is relevant to the current topic, i.e., the speaker's disease. Later in this speech, the speaker talks about her parents and friends; in this case the cause of the disease is considered to be declining and is less relevant to the current topic (her parents and friends). Declining elements like the cause of the disease become unused as time progresses. If the speaker brings up the cause of the disease two days later, she will code it as unused. Thus, I argue that the adjacency of the inferable and evoked statuses, as well as the adjacency of the declining and unused statuses, is cognitively motivated, and that this is universal.

Moreover, I propose that there are at least two kinds of evoked statuses: evoked, and what I call "strongly evoked". Evoked elements are full NPs, and strongly evoked elements are zero and overt pronouns. Figure 4.7 shows the time difference (anaphoric distance) on a logarithmic scale between the time when the first mora of the element in question is produced and the time when that of its antecedent is produced. Zero pronouns are assumed to be produced at the time when the first mora of the predicate is produced. The anaphoric distance approximates activation cost; smaller distance indicates lower activation cost, while larger distance indicates higher activation cost. Figure 4.7 represents the anaphoric distance of three kinds of elements: full NPs, pronouns, and zero pronouns. As is clear from the figure, the anaphoric distance of zero and overt pronouns is smaller than that of NPs, which indicates that zero and overt pronouns are more evoked than full NPs (fixed effects model, *p* < 0.001). Therefore, I propose the status called "strongly evoked". I add this status in Table 4.12. Since overt pronouns coded by topic markers are as strongly evoked as zero pronouns, I suppose that the topic markers *wa* and *toiuno-wa* can also code strongly evoked elements.
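The definition of anaphoric distance given above can be illustrated with a minimal sketch. The onset times below are hypothetical and not taken from the corpus, the function names are my own, and base 10 is assumed for the logarithm, since the text only states that the scale is logarithmic.

```python
import math


def anaphoric_distance(element_onset, antecedent_onset):
    """Seconds between the first mora of an element and that of its antecedent."""
    distance = element_onset - antecedent_onset
    if distance <= 0:
        raise ValueError("the antecedent must precede the element")
    return distance


def log_anaphoric_distance(element_onset, antecedent_onset):
    """Anaphoric distance on a logarithmic scale (base 10 is an assumption)."""
    return math.log10(anaphoric_distance(element_onset, antecedent_onset))


# Hypothetical onset times in seconds, for illustration only.
mentions = [
    {"type": "full NP", "onset": 82.0, "antecedent_onset": 42.0},
    {"type": "pronoun", "onset": 12.5, "antecedent_onset": 10.5},
    # For a zero pronoun, the onset of the predicate stands in for the element.
    {"type": "zero", "onset": 21.0, "antecedent_onset": 20.0},
]

log_distances = {
    m["type"]: log_anaphoric_distance(m["onset"], m["antecedent_onset"])
    for m in mentions
}
```

With these toy values the zero pronoun and the overt pronoun have smaller log distances than the full NP, which is the pattern the figure is described as showing.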


Markers for focus coding map onto agentivity, focushood, contrastiveness, and possibly animacy, as has been discussed in §4.3. Table 4.8 in §4.3 indicates that A and agent S are adjacent to each other, and patient S and P are adjacent. This makes sense because A is conceptually closer to agent S, and P is conceptually closer to patient S.


Table 4.12: Topic marker vs. activation status and the given-new taxonomy

### **4.5.2 Distribution of markers and markedness**

As discussed in §4.3 and summarized in Table 4.8, the distinction between overt vs. zero particles for focus coding is sensitive to grammatical functions, contrastiveness, and animacy. The distribution of overt vs. zero particles for noncontrastive focus coding in Table 4.8 is similar to that of split intransitive languages, if one ignores *ga*-coding for patient S. In general, split intransitive languages code S differently depending on whether it is an agent or a patient; agent S is coded in the same way as A in the transitive clause, while patient S is coded in the same way as P. (93) shows examples from Georgian.<sup>15</sup>

(93) Georgian, South Caucasian


<sup>15</sup>Examples are from the handouts in the lecture called Typology and Universals given by Matthew Dryer at the University at Buffalo in 2010. Glosses are modified.


c. rezo-Ø Rezo-p gamoizarda 3.grow 'Rezo grew up.' (Patient S)

Spoken Japanese and Georgian in (93) follow the typological tendency that agent S and A tend to be overtly coded, while patient S and P tend to be zero-coded. On the other hand, spoken Japanese does not follow the tendency of nominative/accusative languages: the tendency that A and S (nominative elements) are more likely to be zero-coded than P (accusative elements). I argue that, in coding focus elements, patient elements are "unmarked", i.e., more frequent than agent elements, and are more likely to be zero-coded than agent elements. This is supported by studies such as Du Bois (1987) and Du Bois et al. (2003). On the other hand, as regards topic coding, agent elements are more frequent than patient elements, and are more likely to be zero-coded than patient elements. This is observed in another dialect of Japanese: Kansai Japanese. In Kansai Japanese, contrastive topic agents (A and agent S) can be zero-coded, while contrastive topic patients (P and patient S) are overtly coded, as summarized in Table 4.13. See Nakagawa (2013) for a more detailed discussion on the relation between markedness and the distribution of zero vs. overt particles in Standard and Kansai Japanese.

Table 4.13: Contrastive-topic coding in Kansai Japanese


As has been discussed in §4.3.1.4, *ga* sometimes codes non-nominative focus NPs. The theory of markedness also gives a hint to explain why *ga* is on its way to grammaticalizing into a focus particle: focus A is the rarest in naturally occurring discourse, so Japanese native speakers are likely to associate the marker *ga* with focushood. On the other hand, P is very frequently focused, in which case speakers are less likely to associate the marker *o* with focushood.

### **4.6 Summary**

### **4.6.1 Summary of this chapter**

This chapter discussed the distributions of so-called topic markers and case markers in Japanese. I argued that different markers are sensitive to different features, and at the same time, multiple features contribute to the usage of a single marker.


### **4.6.2 Remaining issues**

While there are many remaining questions, one of the biggest issues is that it is necessary to test the proposals made in this chapter through other empirical methods. If the proposals are also supported by other methods, they become more sound. In particular, the distribution of zero particles is mainly based on the acceptability judgements of a few native speakers. This should be tested with a larger number of speakers. One possible experiment is to ask subjects to listen to short conversations where the particles in question are blurred, and then have them produce what they hear. This is an easier task than subtle acceptability judgements, and linguistically naïve subjects can also participate in it.

Another issue is the focus test. So far we only had the *hee* test and the *no* test, both of which depend on the author's acceptability judgements. One possible experiment is to ask subjects to listen to the speech used in this study and respond to what the speaker means by *hee* as if they were the hearers. The elements that many subjects respond to are more likely to be foci. Another possibility is to investigate conversations and study the elements that the hearer actually responds to. Den et al. (2012) annotated response tokens like *hee* and the elements addressed by them. It might be possible to use this annotation to test the second hypothesis.

# **5 Word Order**

### **5.1 Introduction**

This chapter discusses how the information structure of a clause affects word order.

Figure 5.1 shows the overall distribution of elements in terms of their positions in a clause. Elements are counted by phrases (so-called *bunsetsu*). The y-axis indicates the frequency of the elements and the x-axis indicates the position of the elements: 1 means that the element in question appeared in the first position of the clause, 2 means that it appeared in the second position, and so on. I used the values of nth originally included in CSJ. The reason why the frequencies of 1 and 2 are lower than those of 3 is that the linguistic categories that appear in the first or second position are typically fillers, connectives, and adjectives and are excluded from the analysis. The fact that the elements later than fifth in the clause appear very frequently might be counterintuitive based on the ordinary idea of a clause, since a clause consists of a single predicate and at most three arguments and a few more adjuncts. In spoken language, however, there are many fillers, intensifiers such as *hontooni* 'really', and paraphrases, which make the clause longer. Since nth simply counts the position of a phrase in terms of linear position, and not structurally, embedded clauses such as relative clauses are also included in the count. I assume that it is worth including these intervening expressions to analyze where a phrase can be interrupted by them and where it cannot. In fact, the following results show that most non-anaphoric elements appear immediately before the predicate, not interrupted by fillers, intensifiers, and so on (see §5.4). Moreover, CSJ has a unique definition of clause, which is not always the same as the intuitive definition; rather, a clause in CSJ is closer to a single series of clause chains. For example, some subordinate markers such as *-to* 'if' and *-te* 'and' do not work as clause boundaries. These characteristics cause more elements to appear in later positions. See Maruyama et al. (2006) for a detailed definition of clause unit.
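The way linear position interacts with the excluded categories can be sketched as follows. This is only an illustration under stated assumptions: the clause representation, the category labels, and the function name are hypothetical and do not reproduce the CSJ annotation format; only the idea that position is counted linearly while fillers, connectives, and adjectives are left out of the tally comes from the description above.

```python
from collections import Counter

# Categories described above as excluded from the analysis.
EXCLUDED = {"filler", "connective", "adjective"}


def position_counts(clauses):
    """Tally how often an analysed phrase occupies each linear position (nth)."""
    counts = Counter()
    for clause in clauses:
        # Position is counted linearly over all phrases, excluded ones included.
        for nth, (_phrase, category) in enumerate(clause, start=1):
            if category not in EXCLUDED:
                counts[nth] += 1
    return counts


# Toy clauses as (phrase, category) pairs; the category labels are hypothetical.
clauses = [
    [("e", "filler"), ("sore-o", "noun"), ("oziityan-wa", "noun")],
    [("de", "connective"), ("anoo", "filler"), ("kuruma-o", "noun")],
]

counts = position_counts(clauses)
```

Because the excluded phrases still occupy a position, the first analysed phrase in these toy clauses is counted at position 2 or 3 rather than 1, which is why low positions are underrepresented in such a tally.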

Figures 5.2 and 5.3 show element positions and their frequencies based on information status and persistence, respectively. The information status "anaphoric" in this study just means that the element in question has a co-referential antecedent and "non-anaphoric" means that it does not. "Persistent" means that the referent in question is also mentioned in the following discourse, and "nonpersistent" means that it is not. See §3.4.3.2 for the details of the annotation procedure. As was discussed in §4.2, a linear mixed effects model was employed to predict information status (anaphoric vs. non-anaphoric). As fixed effects, word order (nth in CSJ, see §5.1 for the definition of this annotation), particles (*toiuno-wa, wa, mo, ga, o, ni*), and intonation (phrasal vs. clausal IU, see §6.1 for the definitions) were included, and as a random effect, the speaker (TalkID) was included. The model with the effects of word order, particles, and intonation is significantly different from the models without each of them, which indicates that word order, particles, and intonation respectively contribute to the prediction of information status. The model with all three effects is significantly different from the model without the effect of word order (likelihood ratio test, *p* < 0.01); it is significantly different from the model without the effect of particles (*p* < 0.001) and the model without the effect of intonation (*p* < 0.05).

As was also discussed in §4.2, a linear mixed effects model was also applied to predict persistence (persistent vs. non-persistent). Word order, particles, and intonation were included as fixed effects, and the speaker (TalkID) was included as a random effect. The model with the effects of word order and particles is again significantly different from the models without either of them (likelihood ratio test, *p* < 0.01 for the model without word order, *p* < 0.001 for the model without particles). However, the model with the effect of intonation is not significantly different from the model without it (*p* = 0.423). The results are to be discussed in more detail in §5.2.
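The logic of the likelihood ratio tests reported above can be sketched in a few lines. This is not the procedure used in this study: the log-likelihood values are hypothetical, the function names are my own, and the chi-square tail probability is implemented only for one degree of freedom or an even number of degrees of freedom, which keeps the sketch free of external libraries. The degrees of freedom equal the number of parameters dropped from the fuller model, so a factor with several levels, such as particles, would contribute more than one.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (df = 1 or even df only)."""
    if x < 0:
        raise ValueError("the test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df > 0 and df % 2 == 0:
        half = x / 2.0
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    raise ValueError("this sketch handles only df = 1 or even df")


def likelihood_ratio_test(loglik_full, loglik_reduced, df):
    """Compare a model with an effect against the nested model without it."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf(statistic, df)


# Hypothetical log-likelihoods of a model with and without one fixed effect.
statistic, p_value = likelihood_ratio_test(-1200.0, -1204.0, df=1)
```

A small p-value here means that dropping the effect makes the model fit significantly worse, which is the sense in which an effect is said to contribute to the prediction.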

Figure 5.4 shows the overall distribution of elements in terms of their distance from the predicate; 1 indicates that the element appears right before the predicate, 2 indicates that there is one element between the element in question and the following predicate, and so on. If the element appears right after the predicate, the distance is counted as -1. Since the number of post-predicate elements is too small to make any generalization, they are excluded from the figures. Post-predicate elements will be discussed in comparison with dialogues in §5.3.
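The distance measure just described amounts to a signed difference of linear positions. The following sketch makes this explicit; the function name and the index-based clause representation are assumptions for illustration and are not part of the CSJ annotation.

```python
def distance_from_predicate(element_index, predicate_index):
    """Signed distance of a phrase from the predicate, in phrases.

    1 means immediately before the predicate, 2 means one phrase intervenes,
    and -1 means immediately after the predicate.
    """
    if element_index == predicate_index:
        raise ValueError("the predicate has no distance from itself")
    return predicate_index - element_index


# Toy clause; the post-predicate phrase is added purely for illustration.
phrases = ["hon-wa", "taroo-ga", "yon-deru-yo", "kinoo"]
predicate_index = 2

distances = {
    phrase: distance_from_predicate(i, predicate_index)
    for i, phrase in enumerate(phrases)
    if i != predicate_index
}
```

In this toy clause the clause-initial phrase has distance 2, the pre-predicate phrase has distance 1, and the post-predicate phrase has distance -1.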

Figures 5.5 and 5.6 show the distance between the element and the predicate depending on information status and persistence. A linear mixed effects model of information status (with the distance from the predicate and particles as fixed effects and the speaker as a random effect) indicates that whereas the model with particles is significantly different from the model without them (likelihood ratio test, *p* < 0.001), the difference between the models with and without the distance from the predicate is only marginally significant (*p* = 0.060). This entails that the effect of particles significantly contributes to the model, but the effect of distance is inconclusive (see §5.4 for discussion). On the other hand, a linear mixed effects model of persistence (fixed and random effects are the same as above) shows that the effects of both particles and distance are significant to the model (*p* < 0.01 for both the model without particles and that without the distance). The results are also to be discussed in further detail in §5.4.

Figure 5.1: Order of all elements

### **5.2 Clause-initial elements**

This section discusses clause-initial elements. In §5.2.1, it will be argued that shared elements (i.e., unused, declining, inferable, or evoked elements) tend to appear clause-initially, and in §5.2.2, that persistent elements also do. From these observations, it will be generalized that topics tend to appear clause-initially, as predicted from the previous literature. Finally in §5.2.3, I discuss the reasons why topics appear clause-initially.


Figure 5.2: Word order vs. infoStatus

### **5.2.1 Shared elements tend to appear clause-initially**

Figure 5.2 shows the frequency of elements and their positions based on information status. Anaphoric elements appear most frequently in the third position. On the other hand, non-anaphoric elements appear most frequently in the fourth position, but those in the fifth and sixth positions also appear frequently. This distribution of elements across information statuses appears to replicate the classic observation that topics tend to appear earlier in a clause, i.e., the from-old-to-new principle (Mathesius 1928; Firbas 1964; Daneš 1970; Kuno 1978; Gundel 1988). This principle is explicitly formulated in (1).

(1) **From-old-to-new principle**: In languages in which word order is relatively free, the unmarked word order of constituents is old, predictable information first and new, unpredictable information last. (Kuno 1978: 54; Kuno 2004: 326)

Figure 5.3: Word order vs. persistence

This principle is motivated by the accumulative nature of utterance processing; old (or given) elements work as anchors that relate the previous utterance and the following utterance. This principle appears to be supported by examples such as the following. In (2), *sore* 'it' in line c, referring back to *kasi-pan* 'sweetbread' in line b, precedes the A element *oziityan* 'grandfather'.

	- b. yoku often pan-ya-san-de bread-store-hon-loc kasi-pan-o sweet-bread-*o* kat-te buy-and kuru-n-desu-ga come-nmlz-cop.plt-though '(He) often buys sweet bread and comes home,'
	- c. e fl n frg **sore-o** it-*o* i frg maa fl yoowa in.a.word oziityan-wa grandfather-*wa* issyookenmee trying.best taberu-n-desu-keredomo eat-nmlz-cop.plt-though 'that, he tries his best to eat it, but'
	- d. he cannot eat all and
	- e. gives the leftovers to the dog... (S02M0198: 244.48-262.82)


Figure 5.4: Distance from predicate

Note that *sore* 'it' in line c is not coded by *wa* but by *o*. This shows that clause-initial shared elements are not necessarily coded by topic markers, although it is predicted that elements coded by topic markers would be more likely to appear clause-initially than those coded by case markers (see the discussion in §5.2.1.1).

Similarly in (3), *sore* 'it' in line c refers back to *buraunkan* 'cathode ray tube' and appears at the beginning of the clause, preceding other elements.


'this (cathode ray tube), (people) brought it from here to there.'


Figure 5.5: Distance from predicate vs. Information status

d. some people were doing something like that. (S05M1236: 471.26-490.38)

However, this is not the whole story; there are many counter-examples where non-anaphoric elements precede anaphoric ones. Table 5.1 shows the number of cases where anaphoric precedes non-anaphoric and non-anaphoric precedes anaphoric within the same clause. There are 102 cases where anaphoric precedes non-anaphoric, while there are 63 cases where non-anaphoric precedes anaphoric. The cases where anaphoric precedes non-anaphoric only slightly outnumber the cases where non-anaphoric precedes anaphoric. 63 cases (39.4%) is too large a number to believe that they are mere exceptions to the principle in (1).

Figure 5.6: Distance from predicate vs. persistence

I do not claim that the principle in (1) is not correct, but I do claim that the principle does not apply to all cases. Anaphoric elements precede non-anaphoric elements if the anaphoric elements are assumed to refer to the "same" entity which has already been mentioned. In other words, shared elements precede non-anaphoric elements. For example, in (4), *mizu* 'water' is repeatedly mentioned in the utterance, but it is never produced clause-initially. I argue that this is because *mizu* 'water' in (4-b) and later is not assumed to refer to the "same" entity already mentioned in the previous discourse.

(4) a. desukara so daitai approximately iti-niti-ni one-day-for ni-rittoru-no two-liter-gen **mizu-o** water-*o* tot-te drink-and kudasai-to please-quot iw-are-te tell-pass-and 'So we were told to drink two liters of water per day,'

	- b. syokuzi-no meal-gen toki-wa time-*wa* kanarazu surely magukappu-de mug-with ni-hai-bun-no two-cup-amount-gen **mizu-o** water-*o* nomi-masu-si drink-plt-and 'whenever we have a meal, we drink two cups of water,'


In the same way, *tenkan* 'epilepsy' appears many times in (5), but it never appears clause-initially.

(5) a. ato moreover ik-kai one-time.cl **tenkan** epilepsy okosi-tara cause-cond sinu-tte die-quot it-te-ta-n-desu-kedo say-past-nmlz-cop.plt-though '(The doctor) said that, if (my dog) gets an epilepsy seizure once more, (the dog) would die, but...'

	- b. mata again so frg sookoo meanwhile si-teru do-prog uti-ni while-dat **tenkan** epilepsy okosi-masi-te cause-plt-and 'meanwhile, (the dog) has an epilepsy seizure, and...'
	- c. The dog recovered this time, but had an epilepsy seizure several times and finally died. (130.8 sec omitted.)
	- d. sono fl boku-ga 1sg-*ga* dekakeru go.out toki-ni when-dat moo already noki-sita-de eave-under-loc **tenkan** epilepsy okosi-te cause-and 'When I left (home), (the dog) had already had an epilepsy seizure, and...'
	- e. tabun probably sin-dei-ta-n-da-roo-to die-prog-past-nmlz-cop-infr-quot 'probably died...'
	- f. ta frg noki-sita-de eave-under-loc **tenkan** epilepsy okosi-ta-ga cause-past-gen tame-ni reason-dat



Whether the speaker refers to the shared entity mentioned previously depends on the speaker's subjective judgement rather than on objective reasoning. In (6), for example, the anaphoric element *kuruma* 'car' in line c does not appear clause-initially for the same reason as in (4) and (5). However, *kuruma* 'car' in lines b and d is clearly the same entity.

> e fl

	- iki-masi-ta go-plt-past '(we) drove there by rent-a-car by ourselves.'
		- (83.52 sec talking about the mountain.)
	- c. de and anoo fl jibun-no self-gen koko frg koko-de here-loc tyotto a.bit tome-te stop-and miyoo-to try-quot omot-ta think-past toko-ni place-dat koo this.way **kuruma-o** car-*o* tome-te stop-and 'At the place (we) wanted to stop, (we) stopped the car,'
	- d. you can take pictures and so on. (S00F0014: 843.23-940.34)

I argue that, in this case, the speaker does not care about the identity of the car. Rather, she focuses on talking about her trip to Kirauea; the car she was in is not important for this speech. As will be discussed in §5.2.2, the importance as well as the identity of the entity contributes to word order in spoken Japanese. Important (i.e., persistent) elements appear clause-initially.

Interestingly, these elements which are repeatedly mentioned but never appear clause-initially are not referred to by zero or overt pronouns. It is especially difficult to zero-pronominalize *tenkan* 'epilepsy' in (5-b-f) and *kuruma* 'car' in (6-d).<sup>1</sup> Zero pronouns are considered to be the most accessible topics (Givón 1983: 17). To zero-pronominalize, the speaker needs to provide signals to let the hearer know which is the topic, as will be discussed in §5.2.3.

<sup>1</sup> It is difficult to apply this test in (4) because *mizu* 'water' accompanies numeral modifiers such as 'of two liters' and 'two cups of'.

From the discussion above, there are at least two predictions that can be tested in the corpus. Firstly, since evoked and inferable elements are coded by topic markers, as was shown in Chapter 4, it is predicted that elements coded by topic markers tend to appear earlier in a clause (§5.2.1.1). This is because elements assumed by the speaker to be evoked or inferable are also assumed to be shared. Secondly, since pronouns essentially code shared elements which have been mentioned, pronouns are also predicted to appear earlier in a clause (§5.2.1.2). Both predictions are confirmed in the following investigations. Thirdly, I will show that clause-initial elements are not sensitive to activation cost; unused elements can also appear clause-initially (§5.2.1.3). Evoked, inferable, declining, and unused elements are shared (see Table 3.2). Therefore, the claim that shared elements appear clause-initially is supported.

### **5.2.1.1 Topic-coded elements appear clause-initially**

Figure 5.7: Order of arguments coded by topic markers

Let us test the prediction that elements coded by topic markers tend to appear earlier in a clause. Figure 5.7 shows the distribution of topic-coded elements and their positions. Compare this figure with Figure 5.8, which shows the distribution of case-coded elements and their positions. It is clear that elements coded by topic markers are more skewed towards earlier positions within a clause than those coded by case markers.

Figure 5.8: Order of arguments coded by case markers

(7) is an example of a *wa*-coded element appearing clause-initially. The *wa*-coded element *hone* 'bone' in line a, which has been discussed in the previous discourse, is separated from the predicate by an intervening locative (a tomb for animals in the temple). The intervening part is long and the predicate finally appears in line d.

(7) a. ee fl suriipii-no Sleepy-gen itibu-no part-gen oo fl **hone-wa** bone-*wa* 'Part of the bones of Sleepy (dog's name),'

	- b. sono that morimati-no Morimachi-gen watasi-no 1sg-gen senzo-no ancestor-gen o fl hait-teru enter-prog otera-no temple-gen 'the temple in Morimachi where my ancestors were,'


In (8), *sono ko* 'that puppy', whose referent has appeared in line a, is also an example of a *wa*-coded element appearing clause-initially. The element is also separated from the predicate by an intervening argument, 'distemper'.

	- b. **sono** that **ko-wa** puppy-*wa* mata again zisutenpaa-ni distemper-dat kakat-te catch-and sin-zyau-kara die-pfv-because 'the puppy would die of distemper again, so'
	- c. keep a new puppy after this winter, this is what we were told by the vet. (S02M0198: 108.68-126.70)

*Wa* appearing in initial position is already conventionalized, and it is possible to test this with acceptability judgements. It is not acceptable for a *wa*-coded P to appear between the focus agent and the predicate except in contrastive readings of *wa*. As the contrast between (9-a-c) shows, the zero-coded P *hon* 'book' in (9-a) right before the predicate is acceptable, while the *wa*-coded *hon* 'book' in the same position in (9-b) is not acceptable. To express the idea of (9-b), the *wa*-coded P should precede the A, *taroo* 'Taro'.

	- b. ??taroo-ga Taro-*ga* **hon-wa** book-*wa* yon-deru-yo read-prog-fp 'Taro is reading the book.'
	- c. **hon-wa** book-*wa* taroo-ga Taro-*ga* yon-deru-yo read-prog-fp 'Taro is reading the book.' (Constructed)

There is only one example (out of 9 *wa*-coded Ps) in the corpus where *wa*-coded P is preceded by *ga*-coded A. This *wa*-coded P is contrastive, a case which will be discussed in §5.5.


I propose the hypothesis that elements which belong to the same unit of information structure appear adjacent within a clause. I call this the information-structure continuity principle in word order.

(10) **Information-structure continuity principle**: A unit of information structure is continuous in a clause; i.e., elements which belong to the same unit are adjacent to each other.

This principle explains why (9-b) is not acceptable, while (9-a,c) are. The information structure of each of the examples in (9) is represented in (11). In (11-b), the topic P element *hon-wa* 'book-*wa*' intervenes between two focus elements, *taroo-ga* 'Taro-*ga*' and *yon-deru* 'read-prog', which is not acceptable. In (11-c), on the other hand, the topic P does not split up the domain of focus, and the whole sentence is acceptable. In (11-a), all the elements including *hon* 'book' belong to focus and hence *hon* in this position is acceptable.


Interestingly, it is possible for a *wa*-coded A to be preceded by an *o*-coded P, as shown in (12-a) (compare this with (12-b)).

	- b. hon-o book-*o* **taroo-ga** Taro-*ga* yon-deru-yo read-prog-fp 'Taro is reading the book.'

As was argued above, the preposed P, *hon-o* 'book-*o*' in (12), is topical, which is represented as in (13).

(13) a. [hon-o book-*o* **taroo-wa**] Taro-*wa* [yon-deru] -yo read-prog-fp 'Taro is reading the book.'


b. [hon-o] book-*o* [**taroo-ga** Taro-*ga* yon-deru] -yo read-prog-fp 'Taro is reading the book.'

As shown in (13-a), the two topic elements *hon-o* 'book-*o*' and *taroo-wa* 'Taro-*wa*' are adjacent to each other and hence this sentence is acceptable. Also in (13-b), the only topic element *hon-o* 'book-*o*' does not split up the focus elements *taroo-ga yon-deru*, so the sentence is predicted to be acceptable. *Hon-o* 'book-*o*' could be a focus instead of a topic in (12-b), since given elements can be foci. But it is reasonable to think of a situation where given focus elements are preposed so that there is a smooth transition from the previous sentence. The information-structure continuity principle in (10) still holds in either case.

Note that (10) does not refer to word order; rather, it is about adjacency. I argue that this principle is also at work in intonation (see Chapter 6).
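Because (10) is stated purely in terms of adjacency, it can be checked mechanically once each element of a clause is assigned to a unit. The following is a minimal sketch, assuming that a clause is represented as a flat list of unit labels in linear order; the function name and this representation are my own simplification, and the labels follow the analyses of (9) and (13) described above, with the sentence-final particle left out.

```python
from itertools import groupby


def is_continuous(labels):
    """True if every information-structure unit forms one uninterrupted run."""
    runs = [label for label, _group in groupby(labels)]
    # A unit that is interrupted shows up as more than one run of the same label.
    return len(runs) == len(set(runs))


# Unit labels in linear order for the constructed examples discussed above.
examples = {
    "9-a": ["focus", "focus", "focus"],  # taroo-ga hon yon-deru
    "9-b": ["focus", "topic", "focus"],  # taroo-ga hon-wa yon-deru
    "9-c": ["topic", "focus", "focus"],  # hon-wa taroo-ga yon-deru
    "13-a": ["topic", "topic", "focus"],  # hon-o taroo-wa yon-deru
    "13-b": ["topic", "focus", "focus"],  # hon-o taroo-ga yon-deru
}

results = {name: is_continuous(labels) for name, labels in examples.items()}
```

Only (9-b), in which the topic intervenes between two focus elements, fails the check. The function never looks at which unit comes first, which mirrors the point that the principle is about adjacency rather than order.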

What is the difference between clause-initial elements coded by topic markers and those coded by case markers? As was discussed in §4.4.2, there is a hierarchy of topic coding (86), which is repeated here as (14).

(14) A, S > P

The hierarchy indicates that an evoked or inferable A or S is more likely to be coded by a topic marker than a P in the same activation status. Word order is not affected by this hierarchy. Figures 5.9 and 5.10 show word order of anaphoric S and P, respectively. Compare these with Figures 5.11 and 5.12, which show the word order of non-anaphoric S and P. The word order of A is omitted because the number is too small. As can be seen from the contrasts between Figures 5.9 and 5.11 and between Figures 5.10 and 5.12, anaphoric elements are more likely to appear earlier in the clause than non-anaphoric elements. Although the contrast is less clear between anaphoric vs. non-anaphoric P, what is especially notable is that there are nearly three times as many anaphoric Ps as non-anaphoric Ps in the third position. (There are 27 anaphoric Ps in the third position, while there are only 10 non-anaphoric Ps.) I speculate that the contrast is less clear in anaphoric vs. non-anaphoric P than S because there are cases like (4) and (5), where the element is annotated as anaphoric but is considered to not be shared. In this case, P appears pre-predicatively rather than clause-initially. Therefore, I argue that, while elements coded by topic markers are likely to appear earlier in the clause, word order is independent of topic marking. Topic markers are sensitive to the given-new taxonomy, as was discussed in Chapter 4; clause-initial position is sensitive to sharedness. Topic markers and word order are sensitive to different aspects of topichood.


Figure 5.9: Word order of anaphoric S

Figure 5.10: Word order of anaphoric P

Figure 5.11: Word order of non-anaphoric S

Figure 5.12: Word order of non-anaphoric P


### **5.2.1.2 Pronouns appear clause-initially**

Next, let us examine the position of pronouns. Figure 5.14 shows the position of pronouns. Figure 5.1, repeated as Figure 5.13 for comparison, represents the distribution of all elements. Although the number of pronouns is small, it is clear, in comparison with the overall distribution of elements in Figure 5.13, that the order of pronouns is skewed towards initial positions within a clause. Hence, it is reasonable to conclude that pronouns are likely to appear earlier in a clause. Examples of pronouns appearing earlier in a clause are shown in (2) and (3) above. This result is compatible with Yamashita (2002) and Kondo & Yamashita (2008).

Figure 5.13: Order of all elements

### **5.2.1.3 Unused elements appear clause-initially**

Not only evoked, inferable, and declining elements, but also unused elements appear clause-initially. Elements coded by the copula followed by *ga* or *kedo* are unused elements, as was discussed in Chapter 4.<sup>2</sup> It is very unnatural for them to be preceded by other arguments. For example, as shown in the contrast between

<sup>2</sup> See §2.4.2.6 for an explanation of why an element coded by the copula followed by *ga* or *kedo* is not considered to be a clause.

### 5.2 Clause-initial elements

Figure 5.14: Order of pronouns

(15-a) and (15-b), *rei-no ken* 'that issue' cannot be felicitously preceded by another argument, in this case *kotira-de* 'this side'.

(15) a. **rei-no** that-gen **ken-desu-ga** issue-cop.plt-though kotira-de this.side-loc nantoka whatever nari-sou-desu become-will-cop.plt 'Regarding that issue, (I) guess (we) figured the way out.' (modified from Niwa 2006: 283)

a ′ .??kotira-de this.side-loc **rei-no** that-gen **ken-desu-ga** issue-cop.plt-though nantoka whatever nari-sou-desu become-will-cop.plt

In a similar manner, *yamada-no koto* 'the issue of Yamada' cannot naturally be preceded by an adverbial, *ano mama* 'that way', as shown in the contrast between (16-a) and (16-b).

(16) a. **yamada-no** Yamada-gen **koto-da-kedo** issue-cop ano that mama way hot-toi-te leave-let-and ii-no-kana good-nmlz-q 'Regarding Yamada, is it OK to just leave him?' (Niwa 2006: 283)


a ′ .??ano that mama way **yamada-no** Yamada-gen **koto-da-kedo** issue-cop hot-toi-te leave-let-and ii-no-kana good-nmlz-q

Unused elements also include indefinite elements, even though it is counterintuitive to consider indefinite NPs as being "shared". For example, as was mentioned in §3.3.4.2, an indefinite element can appear clause-initially if the speaker assumes the hearer to remember that the speaker (or somebody else) has talked about a category the element refers to. For example, as shown in (17-Y), repeated from (22) in §3.3.4.2, having mentioned the category "mango" makes it possible for *mangoo* 'mango' to appear clause-initially, even though *mangoo* 'mango' is clearly indefinite since the hearer has no way to tell which mango the speaker ate. I regard this as unused and hence shared.

	- Y: **mangoo** mango konoaida the.other.day miyako-zima-de Miyako-island-loc tabe-ta-yo eat-past-fp '(I) ate (a) mango (we talked about) in Miyako island the other day.'
	- Y ′ : konoaida the.other.day miyako-zima-de Miyako-island-loc **mangoo** mango tabe-ta-yo eat-past-fp '(I) ate (a) mango in Miyako island the other day.'

In this case, however, *mangoo* 'mango' in the pre-predicate position is also felicitous, as in (17-Y′ ), which indicates that this is a borderline case; *mangoo* can be a topic in the sense that it is unused and the speaker has talked about it before, while it can be a focus in the sense that it is new to the discourse and indefinite.

On the other hand, in (18-Y), where the speaker does not assume the hearer to remember that the speaker has talked about mangoes, clause-initial *mangoo* 'mango' is infelicitous, whereas pre-predicate *mangoo* is perfectly acceptable.

	- H: What did you do these days?
	- Y: ??**mangoo** mango konoaida the.other.day miyako-zima-de Miyako-island-loc tabe-ta-yo eat-past-fp (=(17-Y))
	- Y ′ : konoaida the.other.day miyako-zima-de Miyako-island-loc **mangoo** mango tabe-ta-yo eat-past-fp '(I) ate (a) mango in Miyako island the other day.' (=(17-Y′))


Therefore, it is reasonable to conclude that shared elements include those which refer to categories the speaker (or somebody else) has talked about, and that they can appear clause-initially.

### **5.2.2 Persistent elements tend to appear clause-initially**

Persistent elements are skewed to earlier positions more than non-persistent elements, as shown in Figure 5.3.

The following are examples of persistent elements appearing clause-initially. In (19), *hihu-byoo* 'skin-disease' in line a, coded by the topic marker *toiuno-wa*, appears clause-initially. The predicate appears in line c, separated from the subject by a proposition in line b and by another clausal argument (*hito-ni* 'person-by') in line c. Also, in line d, *kore-wa* 'this-*wa*', referring to 'skin-disease', appears clause-initially.

	- b. damat-tei-temo keep.silent-prog-even.if 'even if you don't tell people about it,'
	- c. hito-ni person-by mir-are-te-simau see-pass-and-pfv mono-dat-ta-node thing-cop-past-because 'people can see it, so'
	- d. **kore-wa** this-*wa* ano fl omot-ta think-past izyooni more seesintekini mentally kutuu-desi-ta painful-cop-past 'this was more mentally painful than I had expected.' (S02F0100: 222.75-231.09)

Similarly, in (20), *sore-wa* 'that-*wa*' in lines b and g, and *sore-dake-wa* 'that-only-*wa*' in line i, all of which refer to 'chelow kebab' in line a, appear clause-initially.

### (20) a. There is a dish called **chelow kebab**.



	- not.exist-and
	- 'It did not have smell of mutton...'

As was mentioned in §5.1, both word order and particles contribute significantly to predicting persistence, contrary to the result of Imamura (2017), who concludes that "scrambling [PSV order] is pertinent to anaphorically prominent but cataphorically non-prominent objects and that topicalization is especially germane to 'continuing topic' as the referent of the object" (p. 78). There are a few potential reasons why the results of the present work differ from those of Imamura (2017). One is the difference in modality: Imamura (2017) employed a corpus of written Japanese (*the Balanced Corpus of Contemporary Written Japanese*, BCCWJ), while the present study employs spoken Japanese. Relatedly, clause-chaining, which, as I will point out, is one of the reasons why clause-initial elements tend to be persistent (see the next section), appears only in spoken Japanese, not in written Japanese. In any case, this is mere speculation, and further studies are needed to determine why the results of these two studies differ.

### **5.2.3 Motivations for topics to appear clause-initially**

As has been pointed out by many linguists, topics tend to appear clause-initially because they function as an anchor to the previous discourse. The principle in (1) is motivated by this processing convenience (e.g., Keenan 1977). Clause-initial locatives and other adjuncts can also be explained by this motivation. This anchoring function works best when the activation cost of the referent is relatively high (Givón 1983); i.e., when the referent of the element in question is inferable or declining. When the activation cost is low, i.e., when the topic is continuous from the previous discourse, the element in question that refers to the topic is expected to be zero (Givón 1983; Gundel et al. 1993; Ariel 1990); there is no need for


anchoring because the topic is already evoked and the hearer expects it to also be mentioned in the current sentence. This explanation predicts that the distance between the element in question and the antecedent is larger when the element in question is expressed in the form of an NP instead of zero. Figure 5.15 appears to support this prediction, although a statistical analysis indicates that the expression type does not contribute significantly to predicting distance. This paragraph discusses NPs with long distance. See the discussion below for NPs with shorter distance. The whisker plot in Figure 5.15 shows the distance between the element in question (NP vs. (explicit) pronoun vs. zero pronoun) and its antecedent. It measures the time between the production of the first mora of the element in question and the production of the first mora of the antecedent. The figure shows that, in many cases, the distance between an NP and its antecedent is larger than that between a zero pronoun and its antecedent. Zero pronouns are assumed to be produced at the time when the first mora of the predicate is uttered.

This pattern is exemplified in (21), where zero pronouns are indicated by *Ø*. In line b, *san-nin-me* 'the last person' precedes adjuncts ('last fall') and is coded by a variation of *toiuno-wa* (*ttuuno-wa*). Zero pronouns *Ø* are inserted right before the predicate for the purpose of presentation, but this does not affect the analysis. Since this person is one of the three people mentioned in line a, s/he is inferable through a part-whole relation. The topic moves on to another person in line f, who is also one of the three people mentioned in line a. In line j, the speaker again refers to the person mentioned in line b. Also, this time the element *moo hitori-wa* 'the other person' appears near the beginning of the clause, preceding other arguments. The referent continues to be mentioned until line q. Finally, the speaker starts talking about himself in line r, in which case the element *boku-wa* '1sg-*wa*' appears near the beginning of the clause.

	- b. de and anoo fl **san-nin-me-ttuuno-wa** three-cl-ord-*toiuno*-*wa* tui just se frg ee fl kyonen-no last.year-gen o fl aki-ni fall-in yame-ta-n-desu-kedomo quit-past-nmlz-cop.plt-though 'The last person quit this fall.'
		- c. **soitu-wa** 3sg-*wa* maa fl itiban most saisyo-ni first-in yame-tai quit-want yame-tai quit-want ttut-ta quot.say-past ningen-nan-desu-kedomo person-nmlz-cop.plt-though 'He was the first person who said he wanted to quit.'
		- d. This kind of thing happens often.




r. de boku-wa-to ii-masu-to then 1sg-quot say-plt-cond 'Talking about myself...'

s. ... (S05M1236: 639.40-738.22)

In this type of example, clause-initial elements, especially those coded by topic markers, function as an anchor to the previous discourse.

Figure 5.15: Anaphoric distance vs. expression type
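The distance measure plotted in Figure 5.15 can be illustrated with a minimal sketch. It assumes, for illustration only, that the antecedent of a mention is the most recent earlier mention of the same referent; the `Mention` record, its field names, and the function names are hypothetical and do not reflect the actual annotation format of the corpus. As described in the text, an overt element is timed at its first mora, and a zero pronoun is timed at the first mora of its predicate.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Mention:
    """Hypothetical record for one mention of a referent (not the corpus format)."""
    referent: str            # referent identifier
    expr_type: str           # "NP", "pronoun", or "zero"
    onset: Optional[float]   # time (s) of the first mora; None for zero pronouns
    predicate_onset: float   # time (s) of the first mora of the clause's predicate


def mention_time(m: Mention) -> float:
    # Zero pronouns have no onset of their own, so they are timed at
    # the first mora of the predicate, as assumed in the text.
    return m.predicate_onset if m.expr_type == "zero" else m.onset


def anaphoric_distances(mentions: List[Mention]) -> List[Tuple[str, float]]:
    """Return (expression type, distance in seconds) for every anaphoric mention.

    Assumption: the antecedent is the most recent previous mention
    of the same referent. First mentions are skipped.
    """
    last_time = {}
    distances = []
    for m in sorted(mentions, key=mention_time):
        t = mention_time(m)
        if m.referent in last_time:
            distances.append((m.expr_type, t - last_time[m.referent]))
        last_time[m.referent] = t
    return distances
```

Grouping the returned pairs by expression type gives the three distributions (NP, explicit pronoun, zero pronoun) that a plot like Figure 5.15 compares.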

However, Figure 5.15 also indicates that (explicit) pronouns (*kore* 'dem.prox (this)', *sore* 'dem.med (this/that)', *are* 'dem.dist (that)', *kare* '3sg.m (he)', *kanozyo* '3sg.f (she)')<sup>3</sup> and zero pronouns do not differ from each other. Moreover, there are NPs which refer to the immediate antecedent. Whereas more than half of the NPs have a longer distance than explicit and zero pronouns, the figure also shows that many NPs have distances as short as those of explicit and zero pronouns. In fact, a mixed effects analysis for distance (with expression type as a fixed effect and speaker as a random effect) indicates that expression type is not a significant predictor of distance. For example, in example (21), the referent of *hitori* 'one person' in line f is mentioned in line h as *sono hito* 'that person' again, although the distance is not very large.<sup>4</sup> In a similar manner, the referent of *san-nin-me*

<sup>3</sup>*Kare* '3sg.m (he)' and *kanozyo* '3sg.f (she)' are very rare in spoken Japanese. Instead, *kono hito* 'this person' or similar expressions are used more frequently. However, this study does not count them as pronouns.

<sup>4</sup>The impression of line g is that of an inserted clause rather than a topic shift.


in line b is mentioned in the immediately following clause (line c) as *soitu* '3sg'. These examples are not mere exceptions. In fact, 74.1% of referents mentioned for the second time are still expressed in the form of an NP; only 21.4% are expressed as zero and 4.6% as a pronoun, as shown in Table 5.2 and Figure 5.16. Figure 5.16 and Table 5.2 show the expression type of the element in question based on how many times the referent is mentioned. "2" indicates that the element in question is mentioned for the second time, "3" indicates that it is mentioned for the third time, and so on. The ratio of zero increases as the referent keeps being mentioned. The fact that a newly introduced referent is mentioned repeatedly is also reported in Clancy (1980), who investigates Pear Stories; this pattern is not unique to the corpus of the current study. (22) is another example of two NPs which refer to the same referent and which are adjacent. In this example, the very long word *yuugosurabia-syakaisyugi-kyoowakoku* 'Socialist Federal Republic of Yugoslavia' is uttered twice.

(22) a. ee fl kon frg ma fl kono this tiiki area ee fl yu frg ma fl **kyuu-yuugosurabia-syakaisyugi-kyoowakoku**-toiu former-Yugoslavia-socialist-republic-quot tokoro-nan-desu-keredomo place-nmlz-cop.plt-though 'This area is called Socialist Federal Republic of Yugoslavia,'

b. kono this **yuugosurabia-syakaisyugi-kyoowakoku**-tteiuno-wa Yugoslavia-socialist-republic-*toiuno*-*wa* motomotoga originally ee fl minzoku-tairitu-no ethnic-conflict-gen hagesii severe tiiki-de-gozai-masi-te area-cop-plt-plt-and 'this Socialist Federal Republic of Yugoslavia is an area with severe ethnic conflicts...' (S00M0199: 81.95-94.42)

Why does the speaker repeat the same referent next to its previous mention, although s/he can fairly assume that it has already been evoked with the first mention? In fact, the second 'Socialist Federal Republic of Yugoslavia' in line b cannot be omitted, contrary to what is claimed about the nominal forms (Givón 1983; Gundel et al. 1993; Ariel 1990). Why?

Since the most frequent pronoun in Japanese is the zero pronoun, as indicated in Figure 5.16 and Table 5.2, the speaker needs to make sure that the hearer understands which referent zero pronouns refer to. Therefore, the speaker needs


Table 5.2: Nth mention vs. expression type


Figure 5.16: Nth mention vs. expression type

to establish the referent as a topic before s/he uses zero.<sup>5</sup> This might be related to the observation in Lambrecht (1994: 136) that focus elements cannot be the antecedent of zero, while topic elements can. Compare (23) and (24) (the acceptability judgements are based on Lambrecht. Information structure is added by

<sup>5</sup> As pointed out by one of the reviewers (Morimoto), it is possible to replace 'this Socialist Federal Republic of Yugoslavia' in line b of (22) with a pronoun-like form such as *kono kuni* 'this country'. My argument here still holds because the pronoun-like form 'this country' is much more informative than the zero pronoun. The following argument by Lambrecht (1994) also suggests that focus can be the antecedent of overt pronouns, but not zero pronouns. See examples (23) and (24).


the present author). In (23), *John* is interpreted as topic (by default) in (23-b), in which case zero is acceptable.

(23) a. John married Rosa, but he didn't really love her. b. [John] [married Rosa] , but Ø didn't really love her.

On the other hand, in (24), *John* is the focus because it is the answer to the question, in which case zero is not acceptable, as in (24-b). Only an explicit pronoun is acceptable, as shown in (24-a).

	- A: a. John married Rosa, but he didn't really love her.
		- b. \*?[John] [married Rosa] , but Ø didn't really love her.

Figure 5.17: Antecedent's word order of NPs

Why do these pronouns or NPs that refer to the immediate antecedent appear (almost) clause-initially? I argue that, in addition to the from-old-to-new principle (1), the persistent-element-first principle works in spontaneous speech.

(25) **Persistent-element-first principle**: In languages in which word order is relatively free, the unmarked word order of constituents is persistent element first and non-persistent element last.

Figure 5.18: Antecedent's word order of zero pronoun

One of the factors which motivate this principle is clause-chaining. In spoken Japanese, a chain of clauses is frequently observed, as schematized in (26), where the speaker announces the topic at the beginning and continues to talk about it in a chain of multiple clauses.<sup>6</sup>

(26) a. Topic b. Clause1 c. Clause2 d. Clause3 e. ...

A specific example of clause-chaining is shown in (27), where the topic 'Everest Trail' in line a is preannounced, and the following clauses (b–f) are about this topic.

### (27) a. **kono** this **eberesuto-kaidoo-toiuno-wa** Everest-trail-quot-*wa*

<sup>6</sup>This is also pointed out by Michinori Shimoji (p.c.) with reference to Ryukyuan languages, which belong to the same language family as Japanese.


'This Everest Trail is'


(S01F0151: 105.73-120.14)

This pattern is useful because the referent talked about in the chain of clauses in question is referred to at the beginning of the chain and the speaker can use the zero pronoun in the following clauses.

Figures 5.17 and 5.18 show the word order of the antecedents of NPs and zero pronouns, respectively. Although the contrast is subtle, the antecedents of zero pronouns are more skewed towards earlier positions than those of NPs.

Consider example (28). The speaker mentions the topic 'the participants of the trekking' first in line a, and expands on this in the following discourse. After (28-f), the speaker extends the topic and describes each participant.

	- 'to the 72-year-old elderly man,'



In this kind of example, clause-initial elements do not refer to zero pronouns as constituents in the following clauses, but are only pragmatically associated with the constituents in the following clauses (see also §4.4.3).


Table 5.3: Antecedent's particle vs. current expression type

Not all clause-initial antecedents of zero pronouns are coded by topic markers. Figure 5.19 is a bar plot of expression types of elements based on the particles of their antecedents. According to the figure, the antecedents of zero pronouns are more likely to be coded by *wa* or *toiuno-wa* than those of overt NPs, although many antecedents of zero pronouns are coded by *ga* or *o*.

In example (29), clause-initial *waru-gaki* 'brats', coded by *ga* in line a, is the antecedent of the zero pronoun in line b.

(29) a. a fl dokka-no somewhere-gen kinzyo-no neighborhood-gen **waru-gaki-ga** bad-brat-*ga* sute-inu-o abandon-dog-*o* mi-te look-and 'Brats around here found this abandoned dog, and'


Figure 5.19: Antecedent's particle vs. current expression type


This might sound *a priori* to some readers because Japanese is traditionally argued to be an SOV language: of course *ga*-coded elements are subjects and precede other arguments. However, what I claim is that the persistent-element-first principle in (25), in addition to the from-old-to-new principle in (1), is one of the reasons why so-called subjects (A and S) precede other arguments.

Another motivation has been proposed for clause-initial topics repeated immediately after the first mention. Den & Nakagawa (2013) discuss cases where clause-initial topics are used as fillers. Since topics have already been evoked in the speaker's mind, the cost of producing topics is lower than that of producing new elements. While the speaker utters the topic, s/he plans the following utterance. Den & Nakagawa (2013) investigated conversations and found that the topic elements repeated immediately after the previous speaker's utterance are in complementary distribution with fillers. They also found that the length of the final mora of the topic phrase (typically *wa*) correlates with the length of the following utterance (see also Watanabe & Den 2010). In the following example (30), not only is 'Serbian people' mentioned twice, in lines a and b, but almost the whole sentence is repeated; the sentences in lines a and b convey almost the same proposition. This is another piece of evidence that supports Den & Nakagawa's claim;

while repeating almost the same proposition, the speaker can plan what to say next about this topic.

	- b. ee fl kono this ziki time maa fl **serubia-no** Serbia-gen **kata-tati**-ga person.plt-*ga* maa fl koko-ni here-dat tu frg kokka-o nation-*o* tukut-te make-and ee fl serubia-teekoku-toiu Serbia-empire-quot koto-de thing-cop.and 'Around this time Serbian people built a nation, this is the Serbian Empire and'
	- c. ee fl ryuusee-o flourish-*o* **Ø** Ø kiwame be.extreme '(it) flourished.'
	- d. At that time Catholics were coming from the north, and from the south, the Greek Orthodox were coming,
	- e. though they are both Christian,
	- f. ee fl ni-keetoo-no two-stream-gen syuukyoo-no religion-gen naka-de inside-loc seekatu-o life-*o* **Ø** Ø si-te-iku do-and-go naka-de inside-loc 'While (they) were living surrounded by two streams of religion,'
	- g. ee fl serubia-teekoku-tosite Serbia-empire-as ma fl dotira-o which-*o* erabu-ka-tteiu choose-q-quot na frg ko frg ee fl koto-no thing-gen naka-de inside-cop.and '(they) faced the question of which one to choose.'
	- h. ee fl ma fl minami-gawa-no south-side-gen girisya-seekyoo-o Greek-Orthodox-*o* **Ø** Ø toru choose wake-nan-desu-ga reason-nmlz-cop.plt-though '(They) eventually chose the Greek Orthodox.' (S00M0199: 212.34-221.02)


### **5.2.4 Summary of clause-initial elements**

This section investigated characteristics of clause-initial elements. It turned out that shared and persistent elements tend to appear clause-initially. Not only did this study confirm the classic observation that topics tend to appear clause-initially, but this section and the next also analyze what kind of topics appear clause-initially. I also discussed motivations for clause-initial topics.

### **5.3 Post-predicate elements**

While Japanese is reported to be a verb-final language (Hinds 1986; Shibatani 1990), some elements appear after the verb in spoken Japanese (Kuno 1978; Ono & Suzuki 1992; Fujii 1995; Takami 1995a,b; Ono 2006; Nakagawa et al. 2008). The following are examples of post-predicate elements. Since post-predicate elements are very rare in monologues, the examples are from the dialogue part of CSJ. *Kono hito* 'this person' in (31) and *terii itoo* 'Terry Ito (a person's name)' in (32) are produced after the predicates *yat* 'do' and *kake* 'wear', respectively.


This section investigates the information structure of post-predicate constructions of this kind. Although post-predicate expressions could be adverbs, connectives, and other adjuncts, this study examines only noun phrases.

### **5.3.1 Strongly evoked elements appear after the predicate**

Takami (1995a: 136) argues that postposed elements are elements other than the focus. For example, the answer to a question or a *wh*-phrase cannot be postposed naturally. (33) is an example of a postposed element, 'a 10-carat diamond ring', as the answer to a 'what' question. While the sentence itself is natural, the postposed element cannot felicitously answer the question.

(33) Q: What did Taro buy for Hanako?


A: #taroo-wa Taro-*wa* hanako-ni Hanako-for kat-te buy-and yat-ta-yo give-past-fp **zyuk-karatto-no** 10-carat-gen **daiya-no** diamond-gen **yubiwa-o** ring-*o* 'Taro bought (it) for Hanako, a 10-carat diamond ring.'

Similarly, *wh*-phrases such as *dore* 'which' cannot be postposed, as shown in (34).

(34) \*itiban most oisii-desu-ka delicious-cop.plt-q **dore-ga**? which-*ga* 'The most delicious one, which?'

Nakagawa et al. (2008) found that there are two types of post-predicate construction: the single-contour type and the double-contour type. In the single-contour type, the post-predicate elements are uttered without a pause and do not have an F<sup>0</sup> peak. In the double-contour type, on the other hand, post-predicate elements are uttered with a pause and have an F<sup>0</sup> peak. The pitch contours of each utterance are shown in Figure 5.20 for the single-contour type ((35-A) and (36-A)) and in Figure 5.21 for the double-contour type ((35-A′ ) and (36-A′ )), both of which were produced by the author. The post-predicate part is *kome-wa* 'rice-*wa*', whose accent nucleus is on *me* and whose overall accent is supposed to be LHL (L indicates low and H indicates high in pitch). In Figure 5.20, where the postposed element is uttered with the same continuous contour as the main clause, one can observe neither an F<sup>0</sup> peak in *me* nor a pause between the predicate and the postposed element. In Figure 5.21, on the other hand, where the postposed element is uttered in a separate contour from the main clause, one can observe an F<sup>0</sup> peak in *me* and a pause between the predicate and the postposed element.

Nakagawa et al. (2008) investigated the difference between these two construction types in terms of information structure and found that the post-predicate elements of the single-contour type are evoked by being mentioned immediately before or through physical context. On the other hand, the post-predicate elements of the double-contour type are not necessarily evoked. For example, compare examples (35) and (36), where the bold-faced letters indicate that they are high in pitch.<sup>7</sup> The referent 'rice' in (35) is evoked because it is mentioned in (35-Q) immediately before the answer to Q is uttered. In this case, (35-A′ ), where the

<sup>7</sup>Here I assume that the pitch accent of *oisii* 'good' is LHHH and that that of *kome-wa* 'rice-*wa*' is LHL.


post-predicate element *kome-wa* 'rice-*wa*' has its own F<sup>0</sup> peak and is preceded by a pause, is not acceptable, while (35-A), where the post-predicate element without its own F<sup>0</sup> peak is uttered immediately after the predicate without a pause, is acceptable.

### (35) **The referent 'rice' evoked**


On the other hand, in (36), where 'rice' is not evoked before the speaker utters (36-A) or (36-A′ ), only the double-contour type (36-A′ ) is acceptable and the single-contour type (36-A) is not natural.

### (36) **The referent 'rice' not evoked**


A remaining issue is to investigate the difference between elements before and after the predicate in terms of information structure.

Figure 5.20: Post-predicate construction: single-contour type


Figure 5.21: Post-predicate construction: double-contour type

Nakagawa et al. (2008) measured the referential distance (RD) between post-predicate elements and their antecedents, i.e., they measured the number of interpausal units between the element in question and its antecedent. They modified the definition of RD from the original one (Givón 1983) and decided to use the interpausal unit as a measure of RD, since clause boundaries are sometimes difficult to identify in spoken Japanese. Their results are shown in Table 5.4. The table shows that the average RD of the post-predicate elements of the single-contour type is 6.9, whereas that of the double-contour type is 39.7. What about elements before the predicate?

I conducted the same investigation for elements before the predicate, but this time I used the monologues employed throughout this study because the dialogues Nakagawa and her colleagues used in their study lack the information about the RD of elements before the predicate.<sup>8</sup> Further studies are needed to make sure that elements before the predicate in monologues and dialogues have the same characteristics. Table 5.5 shows the average RDs of elements before the predicate based on their word order. Here, I simplified word order to only count arguments (excluding fillers, fragments, adverbs, adjectives, etc.). 1 indicates that the element in question is the first argument in a clause, 2 indicates that it is the second argument, and so on. The RD of the first argument is 20.9 on average, that of the second argument is 23.0, and that of the third argument is 41.1. The table shows that the RD of elements before the predicate is larger than that of postposed

<sup>8</sup>Nakagawa et al. (2008) counted the RD of non-anaphoric elements as 100 (the maximum value of RD), but the present study did not include non-anaphoric elements, since this treatment seemed ad hoc. This modification makes the RD of elements before the predicate (measured in this study) smaller. This has only a small effect and the overall conclusion does not change because, according to our result, the RD of pre-predicate elements is larger than that of post-predicate elements; if this study employed the same criteria as Nakagawa et al., the RD of elements before the predicate would be expected to be even larger.


elements of the single-contour type, regardless of their word order. The RD of double-contour postposed elements is similar to that of preposed elements in the third position. I do not have an explanation for the RD of double-contour postposed elements. I believe that postposed elements of the double-contour type are heterogeneous; some might be an afterthought, some might have interactional functions (Ono 2007), while others might be something else (Tanaka 2005; Guo & Den 2012; see also the discussion in §5.3.2.3). What I want to emphasize here is that the RD of the single-contour postposed elements is smaller than that of elements before the predicate. The postposed elements of the single-contour type are evoked when they are uttered; their activation cost is low. Taking into consideration the fact that many of the post-predicative elements are pronouns or nouns preceded by demonstratives (Nakagawa et al. 2008), I propose that post-predicative elements are often strongly evoked. On the other hand, the activation cost of preposed elements is higher than that of postposed elements.<sup>9</sup>

Table 5.4: RD of post-predicate elements


Table 5.5: RD of elements before predicate
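The RD measure behind Tables 5.4 and 5.5 can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure actually used: the input format (a list of interpausal units, each holding `(referent, group)` pairs) and the function names are hypothetical. RD is counted as the number of interpausal units between a mention and the most recent earlier mention of the same referent, and non-anaphoric elements (first mentions) are left out rather than assigned the value 100, following the choice described in footnote 8.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# One interpausal unit (IPU) is a list of (referent, group) pairs, where
# "group" is a hypothetical label such as an argument position ("1", "2")
# or a construction type ("single-contour").
IPU = List[Tuple[str, str]]


def referential_distances(ipus: List[IPU]) -> Dict[str, List[int]]:
    """Collect RD values (in IPUs) per group; first mentions are skipped."""
    last_ipu: Dict[str, int] = {}
    rds: Dict[str, List[int]] = defaultdict(list)
    for i, mentions in enumerate(ipus):
        for referent, group in mentions:
            if referent in last_ipu:
                rds[group].append(i - last_ipu[referent])
        # Update only after the whole IPU has been processed, so that two
        # mentions inside one IPU are not treated as antecedents of each other.
        for referent, _ in mentions:
            last_ipu[referent] = i
    return dict(rds)


def mean_rd(rds: Dict[str, List[int]]) -> Dict[str, float]:
    """Average RD per group, as reported in a table of mean RDs."""
    return {group: sum(values) / len(values) for group, values in rds.items()}
```

Calling `mean_rd(referential_distances(ipus))` on an annotated sequence of interpausal units yields one average per group, which is the kind of figure the two tables compare.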


The following are examples of post-predicate constructions from dialogues. (37) and (38) are examples of the single-contour type. The postposed elements of this construction are typically pronouns or elements modified by the demonstratives *kono* 'dem.prox (this)', *sono* 'dem.med (this/that)', or *ano* 'dem.dist (that)'. In (37), the postposed element is the pronoun *kore* 'dem.prox (this)'. The participants are working on a task about ranking famous people based on how much they earn. The utterance is produced in the middle of this task and the demonstrative *kore* refers to the ranking so far. Therefore, the referent of *kore* is expected to be evoked in the participants' mind. As shown in Figure 5.22, where the upper

<sup>9</sup>The average RD of zero pronouns is 5.0, which shows that post-predicate elements of the single-contour type are close to zero pronouns.


box indicates the intensity of the utterance and the lower box indicates the F<sup>0</sup>, the postposed element *kore* does not have an F<sup>0</sup> peak.

(37) L: sugoi awful tatakai-da-yo-ne battle-cop-fp-fp **kore** this '(It) is an awful battle, this?' (D02F0025: 463.93-465.81)

In (38), where the participants are involved in the same task as (37), *kono hito* 'this person' is the famous person under discussion right now; hence the referent is evoked in the participants' mind. Figure 5.23 shows the intensity and the F<sup>0</sup> of the utterance in (38). Although the F<sup>0</sup> of the postposed element is not shown because the speaker spoke quietly, the intensity tells us that the postposed part is uttered without a pause. Also, the fact that the intensity is low indicates that the postposed element is only weakly uttered because the referent is sufficiently evoked.

(38) R: nani what yat-teru-no do-prog-nmlz **kono** this **hito** person 'What is (he) doing, this person?' (D02M0028: 193.30-194.45)

Common nouns can also be postposed elements of the single-contour type, as shown in (39). In (39), where the participants are involved in the same task, the postposed element *syasin* 'photo' is uttered without a pause or F<sup>0</sup> peak, as shown in Figure 5.24. Since R, the other participant, is physically holding the photos and this is part of the rules of the task, it is reasonable to assume that the participants have already evoked the photos.

(39) L: siro-kuro-desu-ka white-black-cop.plt-q **syasin** photo 'Are (they) black-and-white, the photos?' (D02F0015: 313.95-315.26)

On the other hand, postposed elements in the double-contour type have not been sufficiently evoked or they are contrastive at the time of utterance. In (40), where the participants are again involved in the task of ranking famous people based on their income, *kotti-wa* 'on my side' is uttered in a separate contour from the main clause, and there is a pause between the main clause and the postposed element, as shown in Figure 5.25. 'On my side' is necessary information in the sense that the other participant, L, was talking about how many people were listed on her own side. Therefore, participant R might have thought that 'there are ten people' is not enough and added 'on my side' later. The F<sup>0</sup> peak of the

Figure 5.22: Intensity and F<sup>0</sup> of the single-contour type (37)

Figure 5.23: Intensity and F<sup>0</sup> of the single-contour type (38)

postposed element *kotti-wa* 'on my side' is still lower than *zyuu* 'ten' in the main clause, and the intensity is also lower. This is because the postposed element is not the focus, as Takami (1995a,b) has pointed out. Foci are typically new in the given-new taxonomy and need both an F<sup>0</sup> peak and intensity in order for the hearer to understand clearly what is said.



Figure 5.24: Intensity and F<sup>0</sup> of the single-contour type (39)

In (41), L is interviewing R about her study on differences among Japanese dialects. R utters 'eastern area' in a separate contour from the predicate because this is the only area where she found no differences between smaller areas (prefectures) when comparing different dialects. Therefore 'the eastern area' is contrasted with other areas. In this case, the F<sup>0</sup> peak and the intensity of the postposed element are as high as those of the main clause, as shown in Figure 5.26.

(41) R: kooiu such.and.such sa-ga difference-*ga* aru-ne-tte exist-fp-q iu-koto-wa say-thing-*wa* ie-nai say-neg zyootai-desi-ta-ne situation-cop.plt-past-fp **kantoo-no** east-gen **hoo-wa** direction-*wa* 'One cannot say that there is such and such difference, (in the) eastern area.' (D04F0050: 338.54-349.27)

### **5.3.2 Motivations for topics to appear post-predicatively**

It has been pointed out that topics or given elements tend to appear clause-initially (Mathesius 1928; Firbas 1964; Daneš 1970). What are the motivations for them to appear post-predicatively? In this section I mainly discuss the post-predicate elements of the single-contour type in comparison with the elements before the predicate. Elements of the double-contour type are heterogeneous, as discussed above, and need further investigation.


Figure 5.25: Intensity and F<sup>0</sup> of the double-contour type (40)

Figure 5.26: Intensity and F<sup>0</sup> of the double-contour type (41)

### **5.3.2.1 Low activation cost and general characteristics of intonation units**

Before getting directly into the question of why some topics appear post-predicatively, let us begin with the question of why some topics do not appear clause-initially. As discussed in §5.2.1 and this section, the activation cost of preposed topics is higher than the activation cost of postposed topics and zero pronouns. The low activation cost of post-predicate elements suggests that they are not anchors to the previous discourse; since they are already sufficiently evoked, they do not have to link the previous context to the current utterance. Therefore, they have a motivation for not appearing clause-initially. Why do they appear post-predicatively?


I argue that the element whose activation cost is low tends to appear post-predicatively because in Japanese and many other languages an intonation unit starts from a high F<sup>0</sup> and gradually declines toward the end (Liberman & Pierrehumbert 1984; Cruttenden 1986; Du Bois et al. 1993; Chafe 1994; Prieto et al. 1996; Truckenbrodt 2004; Den et al. 2010). Since the elements with low activation cost do not require a high F<sup>0</sup>, their preferred position is toward the end of the intonation unit. This kind of phenomenon has already been reported in Siouan, Caddoan, and Iroquoian languages of North America (Mithun 1995). In these languages, this newsworthy-first (i.e., given-last) word order is fully grammaticalized, and Mithun proposes the hypothesis that the given-last word order comes from right-detachment constructions, i.e., the postposed constructions discussed in this section. She argues that this word order is motivated by the general tendency of intonation units to start with a high F<sup>0</sup> that gradually declines. This tendency of intonation units is physiologically motivated, as Cruttenden (1986) discusses:

The explanation for declination has often been related to the decline in transglottal pressure as the speaker uses up the breath in his lungs. A more recent explanation suggests that an upward change of pitch involves a physical adjustment which is more difficult than a downward change of pitch, the evidence being that a rise takes longer to achieve than a fall of a similar interval in fundamental frequency. (Cruttenden 1986: 168)

Moreover, Comrie (1989: 89) argues that unstressed constituents such as clitic pronouns are cross-linguistically "subject to special positioning rules only loosely, if at all, relating to their grammatical relation"; therefore, he argues that "sentences with pronouns can be discounted in favour of those with full noun phrases". Arguing against the hypothesis (Givón 1979) that one can reconstruct the ancient word order of a language based on pronominal affixes and clitics, Comrie suggests that the order of these elements in a clause is more likely to be influenced by stress rhythm properties (Comrie 1989: 218).

I argue that the order of Japanese unstressed pronouns and NPs is also affected by phonetic constraints, as Comrie suggests. As will be discussed in Chapter 6, some unstressed pronouns and NPs referring to highly evoked entities have a decrease in pitch peak and are produced only in low pitch. However, an accent rule in Japanese forbids lexical items starting with two low pitch morae in a row. Therefore, the best position for unstressed items is the sentence-final or post-predicate position, in which unstressed items are allowed. For a phonetic analysis of unstressed items, see Chapter 6.


### **5.3.2.2 Why the post-predicate construction mainly appears in dialogue and the source of its "emotive" usage**

The declination of F<sup>0</sup> does not fully explain post-predicate constructions in Japanese. The discussion above does not explain why the Japanese post-predicate construction mainly appears in dialogues, but not in monologues. Moreover, Japanese post-predicate constructions are reported to have "emotive" characteristics (Ono 2007). As examples for emotive characteristics of post-predicate constructions, consider the following constructed example. Let us assume that a boy gave a present to his girlfriend. The girl happily received the gift and opened it. After seeing the gift, say a banana case,<sup>10</sup> she uttered (42) or (43). Since the most frequent word order in Japanese is predicate-final, the canonical order is the one in (42), whereas (43) can be regarded as a post-predicate construction.

(42) kore this nani what 'What's this?'

(43) nani what kore this 'What's this (weird thing)?'


These two utterances consist of the same constituents *kore* 'this' and *nani* 'what'. As was pointed out in Ono & Suzuki (1992) and Ono (2007), however, the implicatures of these two are different. In (42), she simply does not know what she received, probably because she has never seen it before. By contrast, in (43), she knows what she received (it's a banana case) but she did not like it, as we expected. In other contexts, (43) can be used to express the speaker's surprise, excitement, etc. However, (43) can never be a neutral question. Where does this implicature come from?

Since these two utterances consist of exactly the same elements, it is obvious that the implicature in (43) cannot be derived from the meaning of each constituent. In this study, I propose that two factors, word order and intonation, are involved in the questions of why post-predicate constructions mainly appear in dialogues and of what the source of this "emotive" usage is.

Firstly, I discuss why the post-predicate construction appears mainly in dialogues. My point is that, since the intonation-unit-final position is a position for expressions with interactional functions, the post-predicate element (of the

<sup>10</sup>Bananas of all sizes can fit into this banana case.


single-contour type) plays some interactional role. As has traditionally been argued (e.g., Watanabe 1971), the post-predicate position is for interaction in Japanese. Iwasaki (1993) extended this argument and claimed that in fact the intonation-unit-final position is the position for interaction; the post-predicate position is only one example of this intonation-unit-final position. Consider the following example. Each line corresponds to a single intonation unit. The lines a, b, and c end with the interactional markers *ne* and *sa*, which is indicated by **IT**. As the examples in (44) show, these interactional markers appear IU-finally.<sup>11</sup>

	- b. sinin-o corpses-*o* ID asoko-e there-dir ID minna-**ne** all-fp ID-**IT**
	- c. ano that ID dote-no bank-gen ID ue-e-**sa** top-dir-fp ID-ID-**IT** atsume-te gather-and ID-CO

'gathered dead bodies on top of that bank...' (Iwasaki 1993: 47, gloss and transcription modified by the current author)

As Morita (2005) suggests, a general function of interactional particles such as *ne* and *sa* is "to foreground a certain stretch of talk as an 'interactionally relevant unit' to be operated on – whether that unit is itself a whole utterance or merely one particular component of that utterance" (p. 92). Since post-predicate elements follow these interactional particles within the same intonation unit – as in (32) and (37), where the post-predicate elements follow *ne* – they are also expected to have some interactional function. Guo & Den (2012) report that 77.6% of post-predicate constructions have interactional particles of this kind after the predicate, whereas only 47.0% of non-post-predicate constructions have interactional particles. This also suggests that post-predicate constructions are related

<sup>11</sup>IT stands for "interactional component", one of the four component types in an intonation unit. Other types are: LD (lead component (e.g., fillers)), ID (ideational component), and CO (cohesive component). The order of an intonation unit is proposed to be LD ID CO IT in Japanese (Iwasaki 1993: 44).


to some interactional characteristics. Further investigation is necessary to uncover what kind of interactional functions they have, possibly employing conversation analysis.

Secondly, I argue that the source of the "emotive" implicature of (43) in contrast with (42) comes from an intonational constraint on the post-predicate element. In Japanese, *wh*-questions can optionally be uttered with rising intonation. However, the post-predicate element is always falling and the rising intonation is not natural. Figure 5.28 shows the pitch contour of the utterance *nani kore* 'what's this (weird thing)?' (43), while Figure 5.27 shows the pitch contour of the neutral order *kore nani* 'what's this?' (42). As indicated in the figures, the neutral word order (42) in Figure 5.27 is uttered with rising intonation, and I believe that this is the most frequent intonation, whereas the post-predicate construction (43) in Figure 5.28 has a falling intonation, in which case it is impossible to utter *kore* with rising intonation. It is this constraint on the intonation of post-predicate elements that yields the emotive implicature of the utterance in (43). In fact, the neutral word order *kore nani* can be uttered with falling intonation, as shown in Figure 5.29. In this case, as predicted from the discussion, the falling intonation conveys the emotion of the speaker. It is possible for *nani* 'what' in (43) to be uttered with rising intonation as indicated in Figure 5.30, in which case the emotive nuance of (43) disappears.

Figure 5.27: Pitch contour of *kore nani* (42) with rising intonation

### **5.3.2.3 Post-predicate elements of the double-contour type**

Finally, in this section, I briefly review some intriguing studies on post-predicate constructions which I assume belong to the double-contour type. The first study is Guo & Den (2012). They investigated whether the hearer responds (including back-channel responses) to the speaker near and after the predicate, and


Figure 5.28: Pitch contour of *nani kore* (43)

Figure 5.29: Pitch contour of *kore nani* (42) with falling intonation

Figure 5.30: Pitch contour of *nani kore* (43) with rising intonation of *nani*


showed that the speaker adds post-predicate elements when the hearer does not respond to the predicate. Their further analysis suggests that the speaker produces post-predicate elements to get a response from the hearer and to achieve mutual belief. Let us see example (45), which comes from the dialogue part of CSJ they employed. The duration of silences is shown in seconds inside parentheses, since it is important for the discussion. In (45–L2), where the speaker postposes the element *kono kenkyuu* 'this study', there are pauses between the verb phrase and the postposed demonstrative *kono* 'this', as well as between the demonstrative and the postposed NP *kenkyuu* 'study', which is enough time for L to realize that R does not respond to L. Note that R, the listener of the postposed construction, does not respond until second 604.33, 0.32 seconds after L finished the post-predicate part. Also note that these pauses differentiate post-predicate constructions of the double-contour type from those of the single-contour type.


Tanaka (2005) investigates postposed and preposed constructions in terms of interactional structures: preferred vs. dispreferred structures. See the discussion in §2.4.3.3 for details.


### **5.3.3 Summary of post-predicate elements**

In this section I investigated post-predicate elements. It turned out that the activation cost of postposed elements is much lower than that of preposed elements, elements that appear before the predicate. This suggests that topics also appear post-predicatively. I also discussed why topics appear post-predicatively as well as clause-initially in terms of the shape of intonation and its constraints on Japanese grammar.

The characteristic found in this study is one of many features of post-predicate elements. In future research, it is necessary to explore how these features are related to each other.

### **5.4 Pre-predicate elements**

This section discusses pre-predicate elements, elements which appear immediately before the predicate. In §5.4.1, I show results that indicate that new, i.e. focus, elements tend to appear right before the predicate. In §5.4.2, I discuss why focus elements appear near the predicate.

### **5.4.1 New elements appear right before the predicate**

As shown in Figures 5.2 and 5.5, repeated here as Figures 5.31 and 5.32 for convenience, new elements or focus elements tend to appear immediately before the predicate. Figure 5.31 shows the element position based on information status including all expressions such as fillers, adjectives, etc.; Figure 5.32 shows the distance between each of the elements and the predicate based on their information status. As indicated in Figure 5.31, the distribution of anaphoric elements is skewed towards clause-initial position, whereas that of non-anaphoric elements is not. Drawing from Figure 5.32, we can also see that many new elements appear immediately before the predicate. As discussed in §5.1, the mixed effects model of information status (the distance between the predicate and the element in question) shows that the contribution of distance is only marginally significant. However, a further analysis in this section shows that distance is also a significant factor for predicting information status. As is clear from Tables 4.3 and 4.4, datives tend to code new elements (especially as opposed to *wa*). Datives can appear anywhere, from pre-predicate to clause-initial positions, which is shown in Figure 5.33. Therefore, I tentatively conclude that the distance between the predicate and the element in question (excluding *ni*-coded elements) is an important factor for information status, and that new elements appear before the predicate.


Figure 5.31: Word order vs. information status

This supports a classic observation from other languages that focus appears close to the predicate (Bresnan 1994 and Morimoto 1999 on Bantu languages, Jacennik & Dryer 1992 on Polish, Erguvanli 1984 on Turkish; see Morimoto 2000 for a summary of studies on both VO and OV languages). Further studies are necessary to obtain conclusive evidence.

The following are examples of non-anaphoric elements appearing close to the predicate. (46) and (47) are examples of non-anaphoric P occurring immediately before the predicate. In (46), *kyoomi* 'interest' appears immediately before the predicate *moti* 'have', and, in (47), *aidenthithii* 'identity' in line a, *inoti* 'life' in line b, and *ti* 'blood' in line c appear right before the predicates *kake* 'risk' and *nagasi* 'bleed', respectively. Non-anaphoric Ps are typically abstract concepts like *kyoomi* 'interest' in (46), *aidenthithii* 'identity' in (47-a), and *inoti* 'life' in (47-b), or indefinite like *ti* 'blood' in (47-c).

(46) de then ee fl sono fl ri-too-no remote-island-gen hoo-ni direction-dat sono fl **kyoomi-o** interest-*o* moti have hazime-masi-te start-plt-and '(We) are starting to be interested in remote islands (in Hawaii).' (S00F0014: 149.92-153.33)

Figure 5.32: Distance from predicate vs. InfoStatus

Figure 5.33: Distance from predicate vs. grammatical function

(47)
	- a. tasuu-no many-gen serubia-zin-ga Serbia-people-*ga* minzoku-no ethnic-gen ee fl **aidenthithii-o** identity-*o* kake-te risk-and 'Serbian people bet their identity, and'
	- b. **inoti-o** life-*o* kake-te risk-and 'risked their lives, and'
	- c. **ti-o** blood-*o* nagasi-ta-to bleed-past-q iu say 'bled (in battles),'
	- d. rekisi-ga history-*ga* ee fl sono-go that-later tenkai progress s-are-masu do-pass-plt 'history went on this way.' (S00M0199: 343.53-351.77)

Non-anaphoric S elements also appear immediately before the predicate. They tend to be abstract or indefinite like non-anaphoric Ps. In (48), *kanzi* 'impression', an abstract concept, is the only argument of the predicate *tigau* 'differ' and is therefore S. This element appears immediately before the predicate.


In (49), *hito* 'person' is indefinite and appears before the predicate.

(49) naka-ni-wa inside-dat-*wa* byooin-okuri-ni hospital-send-to naru become **hito**-mo person-also i-masi-ta-kedomo exist-plt-past-though


'Some people were sent to the hospital (lit. People who were sent to the hospital also exist).' (S05M1236: 578.30-581.49)

### **5.4.2 Motivations for a focus to appear close to the predicate**

I argue that the information-structure continuity principle (10), repeated below as (50) for convenience, is also at work here.

(50) **Information-structure continuity principle**: A unit of information structure is continuous in a clause; i.e., elements which belong to the same unit are adjacent to each other.

I assume that the predicate is most frequently in the domain of focus (Lambrecht 1994), optionally with one focus element. Since the predicate and the new element are in the same domain of focus, they also appear together most frequently.

In fact, only a few studies pay attention to the information status (and thus the information structure) of predicates.<sup>12</sup> Unfortunately, this study is no exception. Typically, definite markers such as *the* in English and *der* in German attach to nouns, not to verbs. Also, topic markers such as *wa* in Japanese typically attach to nouns. Therefore, nouns have attracted more attention than verbs. Typically, verbs are followed by tense or aspect markers, subordinate-clause markers, realis vs. irrealis markers, and so on. I believe that these verbal markers are also related to information structure, but this is beyond the scope of this study.

However, it is obvious that argument-focus structure, where the predicate is not in the domain of focus, is the least frequent type among all three types of focus constructions (predicate-focus, sentence-focus, and argument-focus structures). Given that the corpus employed in this study consists of monologues, it is to be expected that there are even fewer examples of argument-focus structures because these structures typically appear as the answer to a who/what question, as shown in (51), where the capital letters indicate prominence.

(51) Q: Who went to school? A: [The CHILDREN] [went to school] . (Lambrecht 1994: 121)

Since there are no (explicit) questions in monologues, we find fewer argument-focus structures.

Another context in which sentences with argument-focus structure appear is the "A not B" context. In monologues, "A not B" contexts typically appear in self-repair, which is also rare in our relatively smooth monologues. Therefore, it is

<sup>12</sup>Hopper & Thompson (1980) is an important exception.


not unreasonable to assume that the predicate is in the domain of focus most of the time, and I argue that the information-structure continuity principle (50) explains why new elements (i.e., focus elements) tend to appear immediately before the predicate.

One piece of evidence that supports the information-structure continuity principle is the fact that it is difficult for presupposed elements to appear immediately before the predicate, interrupting the focus domain. Compare (52-A) and (52-A′), which are answers to the question in (52-Q).<sup>13</sup> In (52-A), the presupposed elements *taroo-ni* 'to Taro' and *hanako-ni* 'to Hanako' interrupt the domains of focus 'gave a travel ticket' and 'gave a cake'. Therefore this sentence is not acceptable. Conversely, in (52-A′), the presupposed elements do not interrupt the domain of focus, and therefore the answer is acceptable.

	- A: ?[ryokoo-ken-o] travel-ticket-*o* [taroo-ni] Taro-dat [age-te] give-and [keeki-o] cake-*o* [hanako-ni] Hanako-dat [tukut-te make-and age-ta] -yo give-past-fp '(I) gave travel tickets to Taro and gave cake to Hanako.'
	- A′: [taroo-ni] Taro-dat [ryokoo-ken travel-ticket age-te] give-and [hanako-ni] Hanako-dat [keeki cake tukut-te make-and age-ta] -yo give-past-fp '(I) gave Taro travel tickets and gave Hanako cake.'
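
The contrast between (52-A) and (52-A′) can be restated as a simple contiguity check. The following is only an illustrative sketch, not part of the analysis itself: the unit labels `"focus"` and `"presupposed"`, the function names, and the flat list encoding of the first conjunct of each answer are all assumptions introduced here for illustration.

```python
from itertools import groupby


def is_continuous(unit_labels):
    """Return True if every unit label occupies a single contiguous span."""
    seen = set()
    # groupby collapses runs of identical adjacent labels into one group,
    # so a label that shows up in two groups has been interrupted.
    for label, _ in groupby(unit_labels):
        if label in seen:
            return False
        seen.add(label)
    return True


def clause_is_continuous(clause):
    """clause: list of (form, unit_label) pairs in surface order."""
    return is_continuous([unit for _, unit in clause])


# Hypothetical labelling of the first conjunct of (52-A): the presupposed
# dative sits between the two parts of the focus domain.
answer_a = [
    ("ryokoo-ken-o", "focus"),
    ("taroo-ni", "presupposed"),
    ("age-te", "focus"),
]

# Hypothetical labelling of the first conjunct of (52-A′): the presupposed
# dative precedes an uninterrupted focus domain.
answer_a_prime = [
    ("taroo-ni", "presupposed"),
    ("ryokoo-ken", "focus"),
    ("age-te", "focus"),
]

print(clause_is_continuous(answer_a))
print(clause_is_continuous(answer_a_prime))
```

Under this encoding, the first conjunct of (52-A) fails the check because the focus label is split into two spans, whereas that of (52-A′) passes. The check says nothing about degrees of acceptability; it only makes the adjacency requirement in (50) explicit.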

A more natural context for (52-A) is one where Q asks what A did for the travel ticket and the cake. Kuno (1978) proposes that the pre-predicate position is for new elements, but he limits this principle to cases where the predicate is given.

(53) In cases where the predicate is given, the position immediately before the predicate is the position for new. (Kuno 1978: 60, translated by the current author)

I argue that this observation also applies to cases where the predicate is new.

Moreover, as will be discussed in Chapter 6, the domain of focus is uttered in a single intonation unit, whereas the topic is uttered separately from the domain of focus. Figures 5.34 to 5.37 show the pitch contours of examples (47) and (48)

<sup>13</sup>Note that they are not a perfect minimal pair because of the accusative marker of *o*. The presence or absence of *o* is determined by word order, and information structure is a kind of side effect in this case. See the discussion in §4.3 for more detail.


we discussed in the last section. As we can see, there is no pause between the predicate and the previous element, and the pitch range is larger in the elements than in the predicates. In Figure 5.36, it is difficult to see the pitch range because *ti* 'blood' does not have an accent nucleus. From the first lowering of *na* in *nagasi-ta* 'bled' being cancelled,<sup>14</sup> one can see that *ti-o* 'blood-*o*' and *nagasi-ta* 'bled' form a single intonation unit.

Figure 5.34: Pitch contour of a in (47)

Figure 5.35: Pitch contour of b in (47)

### **5.4.3 Summary of pre-predicate elements**

The results of this section showed that new elements, namely focus elements, tend to appear right before the predicate. A similar claim has been made by Kuno (1978) and Endo (2014) through constructed examples. This study supported their claim by examining naturally occurring utterances. I also discussed explanations for why the focus appears right before the predicate.

<sup>14</sup>The pitch accent of *nagasi-ta* is LHLL.


Figure 5.36: Pitch contour of c in (47)

Figure 5.37: Pitch contour of b in (48)

### **5.5 Discussion**

This section first discusses possible confounding effects on word order in Japanese, in particular in association with basic word order (§5.5.1). Second, I discuss Givón's topicality hierarchy (§5.5.2). I provide some counter-examples to this hierarchy and propose modifications to it. Finally, I discuss the implications of this study's findings as regards word order typology (§5.5.3).

### **5.5.1 Possible confounding effects**

It is necessary to take other features into account to see the exact effect of topichood and focushood on word order. In particular, the effect of "basic word order" should not be ignored. Here I provide some evidence to support my argument that information structure contributes to word order in spoken Japanese. Figures 5.38 to 5.41 show the word order and information status of each type of grammatical function (A, S, P, and dative). These figures indicate that anaphoric elements

Figure 5.38: Word order of A

Figure 5.39: Word order of S

Figure 5.40: Word order of P

Figure 5.41: Word order of dative


of all grammatical function types are still more likely to appear earlier in a clause than new elements. A and S are more likely to appear earlier in a clause than P because of the basic word order. However, my argument still holds within each grammatical function type. In cases with new elements, one can see the effect of basic word order; the peak of S is 4, which means the 4th position is the most frequent for new S (Figure 5.39), whereas the peak of P is 6, which means the 6th position is the most frequent for new P (Figure 5.40). The distribution of A is not clear because there are few examples, but the trend still seems to hold for A.

### **5.5.2 Givón's topicality hierarchy and word order**

Givón (1983) proposes a hierarchy of topicality, shown in (54) (terminology modified by the author). "RD" refers to referential distance, which is one of the approximations to measure topicality. Low RD means high topicality, while high RD means low topicality.

	- a. Referential indefinite NPs
	- b. Cleft/focus constructions
	- c. Y-moved NPs ('contrastive topicalization')
	- d. Preposed definite NPs
	- e. Neutral-ordered definite NPs
	- f. Postposed definite NPs
	- g. Stressed/independent pronouns
	- h. Unstressed/bound pronouns or grammatical agreement
	- i. Zero anaphora
	- ↓ Low RD (Givón 1983: 7)
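
To make the RD measure behind (54) concrete, the following is a minimal sketch of how an average RD per category could be computed. It rests on assumptions that are not taken from this study: RD is counted here as the number of clauses back to the most recent earlier mention of the same referent, the tuple layout and function names are hypothetical, and the `toy` data are invented purely for illustration (they are not corpus results).

```python
from collections import defaultdict


def referential_distances(mentions):
    """mentions: (clause_index, referent, category) tuples in discourse order.

    Returns a list of (category, rd) pairs, one per mention that has an
    antecedent. First mentions have no antecedent and are skipped.
    """
    last_seen = {}
    distances = []
    for clause_index, referent, category in mentions:
        if referent in last_seen:
            distances.append((category, clause_index - last_seen[referent]))
        last_seen[referent] = clause_index
    return distances


def mean_rd_by_category(mentions):
    """Average the RD values of all mentions that share a category."""
    grouped = defaultdict(list)
    for category, rd in referential_distances(mentions):
        grouped[category].append(rd)
    return {category: sum(values) / len(values) for category, values in grouped.items()}


# Invented toy discourse (NOT data from the corpus).
toy = [
    (1, "ranking", "full NP"),
    (2, "ranking", "zero"),
    (3, "photo", "full NP"),
    (9, "ranking", "postposed NP"),
    (10, "photo", "preposed NP"),
]

print(mean_rd_by_category(toy))
```

A category with a low average under this calculation would sit toward the "Low RD" end of (54). The actual unit of measurement and the treatment of first mentions follow the procedure described earlier in the study, which this sketch does not attempt to reproduce.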

Here I point out two counter-examples to this hierarchy. First, as has already been shown in Tables 5.4 and 5.5, which are repeated as Tables 5.6 and 5.7 for convenience, the average RD of elements in the clause-initial position (20.9) is lower than that in the second (23.0) or third position (41.1). To see this in more detail, I divided the results of Table 5.7 on the basis of grammatical function. This is shown in Table 5.8. Regardless of whether the element is A, S, or P, the overall tendency is that the elements closer to the predicate have a higher average RD.<sup>15</sup> The topicality hierarchy in (54), by contrast, predicts that clause-initial elements (d in (54)) have a higher RD than elements in the neutral-ordered position (e in (54)).<sup>16</sup>

<sup>15</sup>For now I do not have an explanation for S in the second position. It is necessary to test whether the difference between Ss in the first and the second positions is statistically significant or not.

<sup>16</sup>I assume that all elements that have antecedents (and therefore also RDs) are definite.


P in particular runs counter to the topicality hierarchy in (54), according to which P in the second or third positions should have a lower RD than P in the first position, since the neutral position of P in Japanese is the second or third position. However, this is not the case. At least in Japanese, the data show that elements closer to the predicate have higher RDs because the pre-predicate position is for focus and hence for new elements.

Table 5.6: RD of post-predicate elements


Table 5.7: RD of pre-predicate elements (based on argument order)


Table 5.8: RD of pre-predicate elements (based on grammatical function)


Second, the average RD of zero pronouns is as high as that of postposed NPs according to Tables 5.9 and 5.10. This is against the topicality hierarchy in (54), which states that preposed definite NPs (d in (54)) and neutral-ordered definite NPs (e in (54)) have higher RDs than postposed definite NPs. As discussed above, elements are postposed for interactional purposes and/or intonational reasons.

The final point is an additional suggestion for (54) rather than a counter-example. The RD of postposed elements of the double-contour type is much higher than Givón predicts. As will be argued in Chapter 6, a unit of information structure corresponds to a unit of intonation. Since postposed elements of the single-contour type by definition belong to the same intonation unit as the main predicate, the predicate and the postposed element form a single unit (construction)


Table 5.9: RD of postposed elements of the single-contour type (based on expression type)


Table 5.10: RD of pre-predicate elements (based on expression type)


and postposed elements are relatively homogeneous and easy to characterize. However, postposed elements of the double-contour type are heterogeneous, as discussed above, and they are difficult to characterize because the element itself corresponds to a single unit. There are different reasons why such elements are uttered. The function of these postposed elements is determined by the sequence of conversation.

### **5.5.3 Information structure and word order typology**

Since focus elements are most frequently patients according to the correlating features in (2), which is repeated here as (55), the information-structure continuity principle in (10) predicts that, cross-linguistically, P (the patient-like argument in a transitive clause) and V (the predicate) tend to appear together most frequently and, if the word order is fixed in the language in question, P and V tend to appear together.


In fact, this has already been claimed and tested in Tomlin (1986: Chapter 4). Tomlin formulates this claim in terms of Verb-Object Bonding.


(56) **Verb-Object Bonding (VOB):** the object of a transitive verb is more tightly bounded to the verb than is its subject. (Tomlin 1986: 74)

He also states that "[e]xactly why there should be such a bond between a transitive verb and its object is not entirely clear" (ibid.). I propose the information-structure continuity principle as the motivation for such a bond. He enumerates many cross-linguistic pieces of evidence that support VOB. I introduce only a few of them to keep the discussion simple.

First, in many languages, there exists some clause-level phonological behavior (reductions or sandhis) that occurs between object and verb, but not between subject and verb (op. cit., p. 97). In French, for example, liaison does not occur between the subject and a transitive verb, but it does between the object and the verb (see also Selkirk 1972). There is no liaison between the subject *les gens* and the verb *achètent* in (57), whereas there can be liaison between the verb *donnerons* and the object *une pomme* in (58).


Another case is Yoruba (Niger-Congo) vowel deletion (from Bamgbose 1964). In verb-noun sequences in this language, when the object begins with a vowel, the last vowel of the verb is sometimes deleted. This happens between verb and object, but not between subject and verb.

(59) a. gbé brought + odó motor → gb'ódó

	- b. jE eat + iyÓn pounded.yam → j'iyÓn
	- c. Se do + òwò trade → S'òwò

(Bamgbose 1964: pp. 29–30)

### 5.5 Discussion

These phonological phenomena in French and Yoruba suggest that the object and predicate are bound more tightly than the subject and predicate. In a similar manner, in Japanese the focus element and the predicate form a single intonation unit, but the topic element and the predicate do not, as we will see in Chapter 6.

The second piece of evidence that supports VOB is noun incorporation. In Mokilese (Oceanic), for example, there is a set of verbs into which an indefinite object may be incorporated (from Harrison 1976). (60-a) is a transitive clause with a definite object, which is not incorporated into the verb, whereas (60-b) is a clause with an indefinite object, which is incorporated into the verb. Note that the incorporated object *rimeh* 'bottle' in (60-b) is between the verb and the aspect suffix *la*.


'I filled bottles.' (Harrison 1976: 162)

Similarly, compare (61-a) and (61-b). (61-a) is a case where the object *suhkoah* 'tree' is definite and is not incorporated, while (61-b) is a case where the object is indefinite and is incorporated into the verb.


As Mithun (1984) observes, in some languages patient Ss can also be incorporated into verbs, but languages that allow patient S-incorporation also allow P-incorporation (see also Baker 1988): there is a universal hierarchy as in (62). The last two (agent S and A) are in brackets because they are not attested.

(62) P > patient S (> agent S > A)
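The implicational reading of (62) can be made explicit with a small check: if a role can be incorporated, every role to its left on the hierarchy can be incorporated as well. The following Python sketch is only an illustration. The function name `respects_hierarchy` and the sets of roles passed to it are hypothetical, and the hierarchy is simply encoded as an ordered list.

```python
# The hierarchy in (62), from most to least easily incorporated.
HIERARCHY = ["P", "patient S", "agent S", "A"]

def respects_hierarchy(incorporable):
    """Return True if the set of incorporable roles is consistent with (62).

    A set is consistent when no role is incorporable while a role
    higher on the hierarchy is not.
    """
    flags = [role in incorporable for role in HIERARCHY]
    # For each adjacent pair, the lower role may be incorporable
    # only if the higher role is incorporable too.
    return all(higher or not lower for higher, lower in zip(flags, flags[1:]))

# Hypothetical language types (not data from any particular language).
print(respects_hierarchy({"P"}))               # True
print(respects_hierarchy({"P", "patient S"}))  # True
print(respects_hierarchy({"patient S"}))       # False
```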

In Southern Tiwa (Tanoan), for example, the patient Ss 'dipper' and 'snow' are incorporated in (63), while agent Ss such as 'dog' cannot be incorporated as in (64).



b. \*Ø-khwien-teurawe-we A-**dog**-run-pres

'The dog is running.' (agent S) (Allen et al. 1984; Baker 1988)

In Japanese, Kageyama (1993) reports that patient S and P (in his terminology, internal arguments) are widely incorporated into verbs and form noun-verb compounds. He also reports the existence of agent S and A (external arguments) incorporated into verbs, but claims that they are exceptional. The hierarchy of noun incorporation (62) is similar to the hierarchy of zero-marking in Japanese. This is because they are both hierarchies based on focus structure (see also §7.3).

Finally, VOB and the information-structure continuity principle with correlating features of information structure (2) predict that cross-linguistically, P and V appear together most frequently. Table 5.11 shows the order of subject (S in the table, A in our terminology), object (O in the table, P in our terminology), and verb (Dryer 2013c). "[O]ne order is considered dominant if text counts reveal it to be more than twice as common as the next most frequent order; if no order has this property, then the language is treated as lacking a dominant order for that set of elements" (Dryer 2013a). The table shows that SOV and SVO are the most common dominant word orders, as predicted, while the next most common order is VSO, which goes against our prediction. However, note that in deciding which word order is dominant in a language, Dryer included only "a transitive clause, more specifically declarative clauses in which both the subject and object involve a noun (and not just a pronoun)" (Dryer 2013c). Therefore, this dominant word order might not be that of predicate-focus structure. Since both of the full noun phrases can be new, the clause can have a sentence-focus structure. Dryer (1997) (as well as Dryer 2013c) points out that transitive clauses with full lexical nouns do not occur frequently; it is more common that one of the two arguments is pronominal, and such clauses are more likely to have a predicate-focus structure.


For now, a cross-linguistic examination of word orders controlling information structure is very difficult and I leave this problem for future studies.
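Dryer's criterion for a dominant order, quoted above, can be stated operationally. The following Python sketch is only an illustration of that criterion: the function name `dominant_order` and the text counts are hypothetical and are not taken from Dryer (2013a) or from any corpus.

```python
def dominant_order(counts):
    """Return the dominant order, or None if no order is dominant.

    counts: dict mapping an order label (e.g. "SOV") to its text count.
    An order is dominant if it is more than twice as common as the
    next most frequent order.
    """
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    if not ranked or ranked[0][1] == 0:
        return None
    if len(ranked) == 1:
        return ranked[0][0]
    (best, best_n), (_, next_n) = ranked[0], ranked[1]
    return best if best_n > 2 * next_n else None

# Hypothetical counts for two imaginary languages (not real data).
lang_a = {"SOV": 70, "OSV": 20, "SVO": 10}
lang_b = {"SOV": 45, "SVO": 40, "VSO": 15}
print(dominant_order(lang_a))  # SOV
print(dominant_order(lang_b))  # None
```

Under this criterion, a language like the imaginary `lang_b` is treated as lacking a dominant order even though SOV is its most frequent order.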


Table 5.11: Order of subject, object, and verb (Dryer 2013c)

### **5.6 Summary**

### **5.6.1 Summary of this chapter**

This chapter analyzed associations between word order and information structure in spoken Japanese. I showed that shared topics appear clause-initially, while strongly evoked topics appear post-predicatively. Also, new, i.e., focus, elements appear immediately before the predicate. Based on these findings, I proposed the information-structure continuity principle, in addition to the from-old-to-new principle and the persistent-element-first principle.

### **5.6.2 Remaining issues**

As I briefly discussed in §5.5.1, information structure is not the only feature contributing to word order in spoken Japanese. It is necessary to employ statistical analyses including other features to investigate the effect of information structure.

### **6.1 Introduction**

This chapter investigates the relation between information structure and intonation units. I propose that an intonation unit corresponds to a chunk of information, which often corresponds to a unit of information structure. I employ two methods: one is the corpus study that I have employed in the previous chapters (§6.2), and the other is a production experiment, where I ask native speakers of Japanese to read aloud sentences and measure the F<sup>0</sup> of their speech (§6.3). From corpus findings and the results of the experimental study, I propose principles governing intonation (§6.4).

Before going into the analyses, I discuss the two types of intonation units (IUs) investigated in this study: phrasal IU and clausal IU. For the definition of intonation units, see §2.4.4.

I assume that there are many factors determining IUs and it is impossible to investigate all of them. To study information structure factors determining IUs, I distinguish two types of intonation units: the phrasal IU and the clausal IU. A phrasal IU is an IU where an element (an NP of any grammatical function) is uttered in an IU separate from its predicate, whereas a clausal IU is an IU where an element is uttered in the same IU as its predicate. IUs where elements themselves are predicates are excluded from the analysis. Phrasal and clausal IUs are schematized in (1), where each pair of square brackets corresponds to an IU.

(1) a. Phrasal IU: [NP] [Predicate] b. Clausal IU: [NP Predicate]
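The distinction in (1) can be sketched as a small classification routine. The representation below, in which each word is annotated with the index of the IU it belongs to, and the function name `classify_iu` are assumptions made for illustration; they are not the annotation scheme of the corpus.

```python
def classify_iu(np_iu, predicate_iu):
    """Classify an NP as 'phrasal' or 'clausal' from IU indices.

    np_iu: index of the IU that contains the NP.
    predicate_iu: index of the IU that contains its predicate.
    """
    return "clausal" if np_iu == predicate_iu else "phrasal"

# Hypothetical annotations: the NP and the predicate are in different IUs ...
print(classify_iu(0, 1))  # phrasal
# ... or in the same IU.
print(classify_iu(2, 2))  # clausal
```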

The motivations for this distinction come from the observation that IUs in Japanese are more frequently units smaller than a clause (Iwasaki 1993), while IUs in English often correspond to a clause (Chafe 1994). This distinction is also employed in Matsumoto (2003: Chapter 4), who investigated intonation units in Japanese in terms of information flow. (2) is an example of a Japanese IU, where a single line corresponds to a single IU.


Iwasaki states that IUs like those in (2) are typical in Japanese. An IU corresponds to a phrase rather than a clause. Note that the definitions of IU in Iwasaki (1993) and Matsumoto (2003) are different from those employed in this study, which are taken from Den et al. (2010) and Den et al. (2011), even though they share some similarities. In the particular example in (2), most IUs end with the discourse particle *ne*, which often appears IU-finally also in the criteria of Den et al.

### **6.2 Intonation unit and unit of information structure: corpus study**


This section explores the associations between IUs and information structure by investigating our corpus. I will argue that, in general, topics tend to be uttered in phrasal IUs (§6.2.1), while foci tend to be produced in clausal IUs (§6.2.2). I also discuss exceptional cases for each tendency.

Table 6.1: IU vs. information status

Figure 6.1: IU vs. information status

Figure 6.2: IU vs. persistence

Table 6.2: IU vs. persistence

Table 6.1 and Figure 6.1 show the distribution of phrasal vs. clausal IUs in different information statuses (anaphoric vs. non-anaphoric). The term "anaphoric" refers to elements whose referents have been mentioned in the previous discourse, whereas "non-anaphoric" refers to elements whose referents have newly been mentioned (see §3.4.3.3 for more details on the annotation procedure). A linear mixed effects model was employed to predict information status, as we have seen in §4.2 and §5.1. Intonation (phrasal vs. clausal IU), particles (*toiuno-wa, wa, mo, ga, o, ni*), and word order (nth in CSJ, see §5.1 for the definition of this annotation) are included as fixed effects, and the speaker (TalkID in the corpus) is included as a random effect. The model with the effects of intonation, particles, and word order is significantly different from the models without each of them (likelihood ratio test: *p* < 0.05 for the model without intonation, *p* < 0.001 for the model without particles, and *p* < 0.01 for the model without word order).
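For readers unfamiliar with this kind of model comparison, the logic of a likelihood ratio test between two nested models can be sketched as follows. This is a generic illustration using the usual chi-square approximation, not the procedure or software actually used for the analysis above. The function names and the log-likelihood values are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with positive integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Closed-form starting points: df = 2 (even case) or df = 1 (odd case).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def likelihood_ratio_test(loglik_full, loglik_reduced, df_diff):
    """Return the test statistic and p-value for two nested models.

    df_diff is the number of parameters dropped in the reduced model.
    """
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, chi2_sf(stat, df_diff)

# Hypothetical log-likelihoods (not values from this study).
stat, p = likelihood_ratio_test(-1200.0, -1203.0, 1)
print(stat, p < 0.05)  # 6.0 True
```

A small p-value indicates that dropping the factor in question (for example, intonation) makes the model fit significantly worse.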

Table 6.2 and Figure 6.2 show the distribution of phrasal vs. clausal IUs in terms of persistence (persistent vs. non-persistent). Persistent elements are those whose referents are to be mentioned again in the following discourse, whereas non-persistent elements are those whose referents will not be mentioned again. Again, a linear mixed effects model was applied to predict persistence, as discussed in §4.2 and §5.1. Intonation, particles, and word order are included as fixed effects and speaker as a random effect. The model with the effects of particles, word order, and intonation is not significantly different from that without the effect of intonation (*p* = 0.423), whereas it is significantly different from the models without each of the effects of particles and word order (likelihood ratio test: *p* < 0.001 for the model without particles, *p* < 0.01 for the model without word order).

### 6.2 IU and IS unit: corpus study

### **6.2.1 Topics tend to be uttered in phrasal IUs**

This section discusses associations between topics and IUs and argues that evoked, inferable, declining, and unused topics tend to be uttered in phrasal IUs (§6.2.1.1, §6.2.1.2). I also claim that some strongly evoked topics, especially pronouns, are in fact part of the following IU and should be counted as clausal by modifying the definition of IUs (§6.2.1.3). I also discuss exceptional cases where topics appear in clausal IUs (§6.2.1.4). In §6.4, I will argue that topics to be established tend to be uttered in phrasal IUs.


Table 6.3: Intonation unit vs. particles

Figure 6.3: Intonation unit vs. particles

### **6.2.1.1 Evoked, inferable, and declining elements with topic markers in phrasal IUs**

As indicated by Table 6.1, Figure 6.1, and the results of statistical analysis, anaphoric elements are more likely to be uttered in phrasal IUs. Also, Table 6.3 and Figure 6.3 show that elements with topic markers such as *toiuno-wa* and *wa* are more likely to be in phrasal IUs than those with case markers. Elements with topic markers are uttered in phrasal IUs most of the time, while the ratio of elements with case markers (without topic markers) in clausal IUs is larger. These observations indicate that at least evoked and inferable topics tend to be produced in phrasal IUs. This conclusion results from the observation that elements coded by topic markers such as *toiuno-wa* and *wa* are evoked or inferable elements, as argued in Chapter 4. Below I show that declining elements are also uttered in phrasal IUs. In §6.2.1.3, I will argue that strongly evoked elements, especially pronouns, are in fact part of the following IU and should be counted as clausal IUs, although under the current criteria they are included in phrasal IUs.
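The comparison between particles described above amounts to computing, for each particle, the share of elements uttered in phrasal IUs. A minimal sketch of such a tabulation is given below. The function name `phrasal_ratio_by_particle` and the toy records are invented for illustration and do not reproduce the counts in Table 6.3.

```python
from collections import Counter

def phrasal_ratio_by_particle(tokens):
    """Return the share of phrasal IUs for each particle.

    tokens: iterable of (particle, iu_type) pairs,
    where iu_type is either "phrasal" or "clausal".
    """
    totals = Counter()
    phrasal = Counter()
    for particle, iu_type in tokens:
        totals[particle] += 1
        if iu_type == "phrasal":
            phrasal[particle] += 1
    return {particle: phrasal[particle] / totals[particle] for particle in totals}

# Toy records (invented, not corpus data).
toy = [
    ("wa", "phrasal"), ("wa", "phrasal"), ("wa", "clausal"),
    ("ga", "clausal"), ("ga", "phrasal"),
    ("o", "clausal"),
]
ratios = phrasal_ratio_by_particle(toy)
print(ratios["wa"] > ratios["ga"])  # True
```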

Figure 6.4: Pitch contour of (3)

Figure 6.5: Pitch contour of (4)

Figure 6.6: Pitch contour of (5)

Figure 6.7: Pitch contour of (6)

(3) exemplifies an evoked element with a topic marker uttered in a phrasal IU ("‖" indicates IU boundaries). In this talk, the speaker is talking about his former job, collecting debt from people. There is an IU boundary after *kaisyuu hoohoo-wa* 'collecting method-*wa*', the element coded by a topic marker. *Kaisyuu hoohoo* 'collecting method' is evoked because it is mentioned in the immediate context, as indicated by *koo it-ta* 'this way of'.


Figure 6.4 shows the pitch contour of (3). In the figure, one can observe a pitch reset in the first mora of the predicate *mazui* 'wrong'.

(4) is another example, where the speaker is talking about his dog, who had epilepsy. There is an IU boundary after *byooki-wa* 'disease-*wa*'. *Byooki* 'disease' is also evoked because it is mentioned in the immediate context, as indicated by the demonstrative *sono* 'that'.

(4) sono that **byooki-wa** disease-*wa* ‖ kokuhuku overcome si-masi-te do-plt-and ‖ '(The speaker's dog) overcame that disease.' (S02M0198: 480.52-482.47)

The pitch contour of (4) is shown in Figure 6.5. In the figure, one can observe not only a pitch reset, but also falling intonation, which typically occurs IU-finally.

(5) is an example of a *toiuno-wa*-coded element uttered in a phrasal IU. The pitch contour is shown in Figure 6.6. *Hawai-too* 'Hawaii island' is also evoked, as is clear from the demonstrative *kono* 'this'.

```
(5) de
 then
      kono
      this
           ‖ hawai-too-tteiuno-wa
             Hawaii-island-toiuno-wa
                                      ‖ don'na
                                        how
                                               tokoro-ka-tte
                                               place-q
 ii-masu-to
 say-plt-cond
               ‖
 'What kind of place is this Hawaii island?' (S00F0014: 166.53-169.71)
```
As shown in the figure, one can observe the pitch reset in the first mora of the predicate *don'na* 'how'.

Similarly, the inferable element *yomee-wa* 'life.expectancy-*wa*' is produced in a phrasal IU, as indicated in Figure 6.7. *Yomee* 'life.expectancy' is inferable because the speaker is talking about her disease and it is reasonable to assume that life expectancy is part of the knowledge about diseases.

```
(6) osoraku
 probably
          ‖ yomee-wa
           life.expectancy-wa
                              ‖ zyuu-nen
                               ten-cl.year
                                          ‖ -da-to
                                           -cop-quot
                                                      ‖
 iwa-re-masi-ta
 say-pass-plt-past
'(I) was told that (my) life expectancy was 10 years.' (S02F0010:
 312.22-314.91)
```
Declining elements are also produced in phrasal IUs rather than clausal IUs. Consider the following example. In (7), two competing topics, *meisei* 'fame' and *sigoto* 'work', are introduced in line a. Then, the speaker starts to talk about fame first and moves on to 'work' in line g, where the topic *sigoto* 'work' is considered to be declining. In this case, there is an intonation-unit boundary after *sigoto-no bubun-na-n-desu-keredomo* 'concerning the other one, work'.

	- b. Concerning fame,
	- c. I have been participating in various piano competitions
	- d. So far the best award I received was the fourth best place in the China-Japan International Competition.
	- e. Beyond that, I would like to receive higher awards.



### **6.2.1.2 Unused elements with topic markers in phrasal IUs**

Unused elements with topic markers also tend to be uttered in phrasal IUs. Elements coded by a copula plus *kedo* or *ga* appear in phrasal IUs most of the time. For example, in (8-a), the element *sutairu* 'style', which is introduced for the first time, is produced in a phrasal IU.<sup>1</sup>



Similarly, in (9-a), *kandoo* 'emotion' is mentioned for the first time and is produced in a phrasal IU.

	- b. um, the Himalaya Mountains have a very unique shape I've never seen before,
	- c. Actually, local people call them holy mountains,
	- d. hm, somehow their shapes are sacred. (S01F0151: 460.73-477.82)

Readers might speculate that these elements appear in phrasal IUs because they are long expressions. However, the examples in the experimental study in §6.3 that force the speakers to assume that the topics are unused are short expressions (one word). The experiment shows that these short unused topics are still produced in phrasal IUs.

<sup>1</sup> In fact, the predicate of 'style' is not clear in this example. This is a general characteristic of topics. See discussion in §4.4.3 for more detail.

### **6.2.1.3 Strongly evoked elements in clausal IUs**

Figure 6.8: Anaphoric distance vs. expression type (all)

I propose that strongly evoked elements, usually pronouns coded by topic markers, are uttered in clausal IUs, although they are categorized into phrasal IUs by the current definition. Because strongly evoked elements tend to be uttered in a low pitch and with a smaller pitch range than the following accentual phrase, they are likely to be counted as phrasal IUs. However, I argue that they should be regarded as clausal IUs. Since the number of pronouns is very small, this change does not influence the overall tendency in Figure 6.3 and Table 6.3, and hence it does not affect the conclusion proposed in the last section. The claim that pronouns are strongly evoked elements is supported by Figure 6.8, repeated from Figure 4.7, which shows the time difference between the time when the first mora of the element in question is produced and the time when its antecedent is produced. This is assumed to approximate the activation cost of elements. As indicated in the figure, pronouns have an activation cost as low as that of zero pronouns.
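The measure plotted in Figure 6.8 can be thought of as a simple time difference that is then summarized by expression type. The sketch below illustrates this under two assumptions that go beyond the text: that both time points are measured in seconds from the start of the talk, and that the mean is used as the summary. The function names and the toy values are hypothetical.

```python
def anaphoric_distance(element_time, antecedent_time):
    """Time (in seconds) from the antecedent to the first mora of the element."""
    if element_time < antecedent_time:
        raise ValueError("the antecedent must precede the element")
    return element_time - antecedent_time

def mean_distance_by_type(mentions):
    """Average the anaphoric distance for each expression type.

    mentions: iterable of (expression_type, element_time, antecedent_time).
    """
    sums, counts = {}, {}
    for expr_type, element_time, antecedent_time in mentions:
        d = anaphoric_distance(element_time, antecedent_time)
        sums[expr_type] = sums.get(expr_type, 0.0) + d
        counts[expr_type] = counts.get(expr_type, 0) + 1
    return {t: sums[t] / counts[t] for t in sums}

# Toy values (invented, not corpus data).
toy = [("pronoun", 12.5, 11.0), ("pronoun", 30.0, 29.0), ("noun", 50.0, 40.0)]
print(mean_distance_by_type(toy))  # {'pronoun': 1.25, 'noun': 10.0}
```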

First, I show examples of strongly evoked elements and their pitch contours. These pitch contours are different from those of evoked elements we have seen in the previous section. (10) is one of the few examples from the corpus of the current study, CSJ, whose pitch contour is shown in Figure 6.9. The IU boundary "‖" is inserted based on the current definition. I argue that there is no boundary after *sore-wa* 'that-*wa*'.

(10) **sore-wa** that-*wa* ‖ nan-daroo-to what-cop.infr-quot omot-te think-and ‖ '(I) was wondering what it was...' (S00F0014: 654.06-655.18)

Since the number of pronouns is small in the current corpus, I provide examples from another corpus. Examples (11) and (12) are from *the Chiba three-party conversation corpus*, which is a corpus of three people's casual conversation (Den & Enomoto 2007). Their pitch contours are shown in Figures 6.10 and 6.11 respectively. Again, the IU boundary is inserted based on the current definition that I challenge.


As shown in Figures 6.9–6.11, there is neither a pause nor vowel lengthening, which is often observed IU-finally. Moreover, the accent nucleus is not clearly observed in these pronouns. This suggests that the phrasal IU of evoked elements coded by topic markers and that of strongly evoked elements are qualitatively different. Since strongly evoked elements are already evoked and do not need to attract the hearer's attention, they are uttered with a lower pitch. When they are followed by the predicate, which is typically not evoked and needs to attract the hearer's attention, the predicate is uttered with a higher pitch, which causes a pitch reset.

I challenge the claim that this type of strongly evoked element actually forms a single chunk of processing. First, in addition to the qualitative difference between phrasal IUs of evoked elements and those of strongly evoked elements, the transition from an IU with a single strongly evoked element such as *are* and *sore* in Figures 6.9–6.11 to the next is too fast for the speaker to plan the next utterance, assuming that an IU represents some kind of processing unit. This suggests that the current element and the following element(s) belong to a single processing unit.

Second, one single strongly evoked element is too small a number for a processing unit. Pronouns in particular are of relatively high frequency (although they are less frequent than zero pronouns) and the referent is assumed to have been evoked both in the speaker's and in the hearer's mind. Although "the magic number" is still controversial (including the skepticism about "expressing capacity limits of human cognition in terms of a number" (Oberauer 2007: p. 245)), Cowan (2000; 2005) estimates that the magic number is around four in healthy young adults, whereas, in the original proposal in Miller (1956), the number is seven plus or minus two. In either case, one element is too small in terms of this magic number.

Third, it is known that, historically, unstressed pronouns can turn into clitics, then into affixes (Givón 1976). Japanese pronouns such as *are* and *sore* are not exceptions; the *r* in *are* and *sore* is sometimes reduced and uttered very quickly, which is highly likely to become a motivation for them to turn into clitics in the future. Moreover, these pronouns often do not seem to have a clear pitch peak any more. The original pitch accent of *kore*, *sore*, and *are* is LH (the accent type of *kore*, *sore*, and *are* is a flat type; i.e., they do not have an accent nucleus). However, at least the pitch contours of the pronouns in Figures 6.9–6.11 are not LH any more.<sup>2</sup> The pronoun *are* in Figure 6.10 is completely low, and *sore-wa* in Figure 6.9 is HL, whose first pitch I believe is high because the pronoun appears utterance-initially. When such clitic pronouns start to phonologically depend on other words, it becomes harder to argue that a single clitic corresponds to a single processing unit.

From the observations above, I propose that IUs with a single anaphoric element do not form a single processing unit; rather, it is more appropriate to integrate them into the following IU and regard the whole chunk as a unit of processing. How to decide whether an IU should be integrated into the following IU or not is left for future research.

### **6.2.1.4 Elements with topic markers in clausal IUs**

I have claimed that evoked topics tend to be uttered in phrasal IUs, while strongly evoked topics tend to be uttered in clausal IUs. This section discusses cases where lexical NPs coded by topic markers are produced in clausal IUs for several reasons.

First, contrasted elements coded by topic markers are typically uttered in a clausal IU; the pitch range of contrasted elements with the topic marker *wa* is larger than that of the predicate. In (13), for example, where the speaker is talking about his life with his dog in Germany, *ti-nomi-go* 'infant' and *inu* 'dog' are contrasted.

<sup>2</sup> This breaks one of the pitch accent principles of Japanese discussed in §2.4.1, which states that the pitch of the first and the second mora within a word must be different. I claim that this is one of the motivations for pronouns to appear after the predicate. See also §5.3.2.1 for discussion.

Figure 6.9: Pitch contour of (10)

Figure 6.10: Pitch contour of (11)

Figure 6.11: Pitch contour of (12)

Figure 6.12: Pitch contour of a in (13)

Figure 6.13: Pitch contour of b in (13)

Figure 6.14: Pitch contour of (14)


As shown in Figures 6.12 and 6.13, the pitch range of the contrasted elements coded by the topic marker *wa* is larger than that of the predicates.

In a similar vein, in (14), *siken* 'exam' is implicitly contrasted with *mensetsu* 'interview'. Although the speaker did not do well in the exam, she had a fun time in the interview and she successfully passed the admission.


In this case, as shown in Figure 6.14, *siken* 'exam' is uttered in a wider pitch range than the predicate.

Also, when the clause is in a special status and is uttered faster, elements coded by topic markers are typically uttered in clausal IUs. For example, inserted clauses are uttered faster relative to other utterances and their pitch is lower than the surrounding utterances. In (15), where the speaker explains Everest treks and which course she took, she inserts the clause describing the geometry of the Himalayas in (15-c). This clause contains an element coded by a topic marker, i.e., *himaraya-wa* 'Himalaya-*wa*', which is uttered in a clausal IU.

Figure 6.15: Pitch contour of c in (15)

Figure 6.16: Pitch contour of a in (16)

Figure 6.17: Pitch contour of (17)

	- b. eberesuto-kaidoo-to Everest-trail-quot yob-areru call-pass ‖ masani exactly ‖
	- c. ee fl **himaraya-wa** Himalaya-*wa* yokoni horizontally nagai-n-desu-keredomo long-nmlz-cop.plt-though ‖
	- d. ee fl sono that ‖ ee fl higasi-gawa-ni east-side-dat ataru correspond ‖
	- e. eberesuto-o Everest-*o* ‖ nn fl -ni -dat mukat-te face-and iku go ‖ ruuto-desu route-cop.plt 'The course I took for trekking is called the Everest Trail, which exactly, **uh the Himalayas are long horizontally**, uh on the east side is Everest and we walked toward the Everest.' (S01F0151: 89.71-105.25)

As shown in Figure 6.15, the F<sup>0</sup> peak of *himaraya-wa* 'Himalaya-*wa*' is higher than that of the following predicate; therefore there is no IU boundary between the noun and the predicate.<sup>3</sup> In a similar way, in (16), where the speaker talks about her childhood dream, she comments on her dream in the inserted clause (16-a).

	- b. because I liked beautiful flowers,
	- c. (my dream was to be a) florist. (S01F0038: 53.90-58.93)

Figure 6.16 shows the pitch contour of (16-a). As in the figure, the F<sup>0</sup> peak of the topic phrase *kore-wa* 'this-*wa*' is higher than that of the predicate. Therefore, there is no IU boundary after *kore-wa*.

<sup>3</sup> In (15), pitch range difference cannot be used to determine the IU boundary because the F<sup>0</sup> of the phrase *himaraya-wa* is always high and hence one cannot meaningfully measure the pitch range. In this case, the IU boundary is identified after the phrase in question if the F<sup>0</sup> peak of the phrase is lower than that of the following phrase. In (15), the F<sup>0</sup> peak of *himaraya-wa* is higher than that of the predicate. Therefore, the IU boundary is not identified after the phrase *himaraya-wa* (see Igarashi et al. 2006: p. 420 ff.).
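The fallback criterion described in footnote 3 is a simple comparison of F0 peaks. It can be written out as follows. The function name `boundary_after_phrase` and the Hz values are hypothetical, and the sketch covers only this peak-based criterion, not the full set of IU criteria.

```python
def boundary_after_phrase(f0_peak_phrase, f0_peak_next):
    """Return True if an IU boundary is identified after the phrase.

    A boundary is identified when the F0 peak of the phrase is lower
    than the F0 peak of the following phrase.
    """
    return f0_peak_phrase < f0_peak_next

# Hypothetical F0 peaks in Hz (not measurements from the corpus).
print(boundary_after_phrase(180.0, 220.0))  # True: boundary identified
print(boundary_after_phrase(240.0, 200.0))  # False: no boundary
```

The second call corresponds to the situation described for *himaraya-wa*, where the peak of the topic phrase is higher than that of the predicate.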

Another type of topic-coded element uttered in a clausal IU is embedded in a noun-modifier clause or quotation clause. For example, in (17-a), *piano-wa* 'piano-*wa*' is embedded in a quotation clause; the clause is the content of what the speaker thought.


As indicated in Figure 6.17, which shows the pitch contour of (17-a), the F<sup>0</sup> peak of the topic phrase *piano-wa* is higher than that of the predicate and the whole clause is interpreted as a single IU.

### **6.2.2 Foci tend to be uttered in clausal IUs**

### **6.2.2.1** *Ga***-coded S and** *o***-coded P that appear in clausal IUs**


Table 6.4: Intonation unit vs. particles

Table 6.3 and Figure 6.3, repeated here as Table 6.4 and Figure 6.18, indicate that *ga*- and *o*-coded elements are more likely to appear in clausal IUs than those coded by topic markers. In terms of grammatical function, Ss in particular turned out to be more likely to be uttered in clausal IUs than As, as shown in Table 6.5 and Figure 6.19, which show the distribution of grammatical function in terms of intonation unit regardless of whether elements are coded by topic markers or case markers. Since *ga* and *o* code focus and S and P also correlate with focus, it is reasonable to conclude that focus in general tends to appear in clausal IUs.

Figure 6.18: Intonation unit vs. particles

(18-b) is an example of S in a clausal IU. The element *o-hanasi-ga* 'plt-speech-*ga*' is uttered in a clausal IU.

(18) a. our way of collecting debt might be problematic,

b. oo fl mina-san everyone-hon ‖ zisyuku control suru-yooni-to do-imp-quot iu say ‖ **o-hanasi-ga** plt-speech-*ga* de-masi-te come.out-plt-and ‖ 'somebody proposed that employees should improve the method.' (S00M0221: 503.23-511.02)

Figure 6.19: Intonation unit vs. grammatical function

As shown in Figure 6.20, there is no pitch reset in the first mora of the predicate. Also, the pitch range of *o-hanasi-ga* 'plt-speech-*ga*' is larger than that of the predicate *de-masi-te* 'come.out-plt-and', which indicates that the S element and the predicate are uttered in a single IU.

In a similar vein, in (19), whose pitch contour is shown in Figure 6.21, the S element *sikitari-ga* 'tradition-*ga*' and the predicate are uttered in a single IU; there is no pitch reset observed in the first mora of the predicate.

(19) hizyooni very kanasii sad ‖ anoo fl ‖ **sikitari-ga** tradition-*ga* ari-masi-te exist-plt-and ‖ 'There was a very sad tradition...' (S05M1236: 297.99-305.33)

(20-a) is an example of a P uttered in a clausal IU.

As shown in Figure 6.22, since there is no pitch reset in the first mora of the predicate *tori-tai* 'take-want' and the pitch range of the element *puro-raisensu-o* 'professional-license-*o*' is larger than that of the predicate, there is no IU boundary after the element *puro-raisensu-o* 'professional-license-*o*'.

Similarly, in (21-c), whose pitch contour is shown in Figure 6.23, the clause is uttered in a single IU. The pitch range of the element *syuzyutu-o* 'operation-*o*' is larger than that of the predicate.

	- b. many times (I) stayed in the hospital and
	- c. **syuzyutu-o** operation-*o* uke-tei-tari receive-prog-hdg ‖ si-tei-ta-node do-prog-past-because 'received operations, so'
	- d. when I die,
	- e. (I) was thinking that (I) would probably die in an accident or from a disease. (S02F0100: 387.22-399.08)

Figure 6.20: Pitch contour of (18)

Figure 6.21: Pitch contour of (19)

Figure 6.22: Pitch contour of a in (20)

Figure 6.23: Pitch contour of c in (21)

### **6.2.2.2** *Ga***-coded S and** *o***-coded P that appear in phrasal IUs**

Figure 6.24: Pitch contour of a in (22)

Here, I discuss *ga*-coded S and *o*-coded P that appear in phrasal IUs. Although they are more likely to be uttered in clausal IUs than those coded by topic markers, they are still often uttered in phrasal IUs, as shown in Table 6.5 and Figure 6.19. I point out two types of focal elements uttered in phrasal IUs.


Figure 6.25: Pitch contour of a in (23)

Figure 6.26: Pitch contour of b in (23)

The first type is that of strongly evoked elements that are uttered in a lower pitch than their predicate and that are therefore followed by an IU boundary. These are uttered in phrasal IUs for the same reason as discussed for pronouns in §6.2.1.3. For example, in (22), whose pitch contour is shown in Figure 6.24, *piano* is strongly evoked and is uttered in a lower pitch than its predicate. Therefore, the F<sup>0</sup> range of *piano* is smaller than that of the following predicate and there is an IU boundary between the element *piano* and the predicate. *Piano* is considered to be strongly evoked because the speaker mentions it repeatedly throughout her talk.


Similarly, in (23-a), whose pitch contour is shown in Figure 6.25, *kusuri* 'medicine' is strongly evoked and uttered in a lower pitch than the predicate *tamesu* 'try'. *Kusuri* 'medicine' is strongly evoked because it has also been mentioned immediately before (23-a), as indicated by *sono* 'that'.

	- b. de then ‖ tasikani certainly sono that **kusuri-o** medicine-*o* nuru-to put-cond ‖ 'As the doctor said, when (I) put on the medicine,'
	- c. (my disease) becomes a little bit better... (S02F0100: 155.34-159.32)

However, in (23-b), which immediately follows (23-a), the F<sup>0</sup> peak of *kusuri* 'medicine' is higher than that of the predicate *nuru* 'put on', as shown in Figure 6.26. This contrasts with what I have claimed so far. I believe that the F<sup>0</sup> peak of *kusuri* in (23-b) is higher than that of the predicate because it appears sentence-initially. Japanese is a clause-chaining language, which combines multiple clauses to form a thematic unit (Longacre 1985; Martin 1992; Givón 2001). The F<sup>0</sup> of sentence-initial clauses is the highest and it declines as the sentence goes on (Koiso & Ishimoto 2012; Ishimoto & Koiso 2012; 2013). Therefore, elements in sentence-initial position have the highest F<sup>0</sup> among the elements of the sentence. As I have argued in §6.2.1.3, a pair consisting of a strongly evoked element and the following phrase should be considered to form a single processing unit. As shown in Figures 6.24–6.26, there is no pause or vowel lengthening, both of which typically appear IU-finally, between the anaphoric element and the predicate. This supports the notion that they should be integrated into a single unit at a level higher than the intonation unit.

The second type of focal elements uttered in phrasal IUs is not as clear as the first one. I am not sure whether examples of the second type share the same characteristics. Rather, it is likely that they are still heterogeneous. Here I try to capture some of their characteristics. In some examples of the second type, the element is non-anaphoric and the F<sup>0</sup> is high; however, the F<sup>0</sup> of the predicate is also high for some reason. Examples of this kind are shown in (24) and (25). In (24), *kusa* 'grass' is non-anaphoric and is uttered with prominence, but there is a pitch reset before the predicate, which has its own F<sup>0</sup> peak as in Figure 6.27.

(24) a. **kusa-ga** grass-*ga* ‖ hae-te grow-and ki-ta come-past ‖ tokoro-ni place-dat ‖ 'The place where grasses grow up'


b. some people build houses... (S00F0014: 276.80-279.30)

In (25), in a similar vein, there is a pitch reset before the predicate; the non-anaphoric element *tatoe* 'metaphor' and the predicate *warui* 'bad' each have their own F<sup>0</sup> peak, as in Figure 6.28.

	- b. it's kind of a kamikaze-like idea. (S00M0199: 360.76-365.14)

Figure 6.27: Pitch contour of a in (24)

Figure 6.28: Pitch contour of a in (25)

In other examples of the second type, non-anaphoric elements are uttered in a low pitch without prominence as though they were strongly evoked. In example (26), the brand-new element *nyuukinbi* 'the deadline of repayment' is produced in a low pitch, contrary to our prediction, as shown in Figure 6.29.

	- b. oo fl ‖ **nyuukinbi-ga** deadline-*ga* ‖ sugi-te pass-and ori-masu-toiu prog.plt-plt-quot koto-de thing-cop ‖

' "The deadline for repayment has passed" something like that...' (S00M0221: 220.24-225.28)

In this case, however, *nyuukinbi* 'the deadline of repayment' can also be regarded as inferable from the previous context, because the speaker has been talking about the people who did not return money, although the speaker has not specifically mentioned *nyuukinbi* 'the deadline'. However, it is more natural for inferable elements to acquire their own pitch peak.

Moreover, there are also cases where perfectly brand-new elements are uttered in a low pitch as if they were strongly evoked. In (27), neither the element *kyoomi* 'interest' nor any related concept has been mentioned in the previous discourse, yet it is still uttered in a low pitch as in Figure 6.30.


I do not have a clear explanation for why this happens. Intuitively, the F<sup>0</sup> peak can be either on the element *kyoomi* 'interest' or on the predicate *moti* 'have' and the nuance does not change. However, it is unnatural if both the element and the predicate have their own F<sup>0</sup> peak. Typically there is no pause or vowel lengthening between the element and the predicate in this type of example. Therefore, I tentatively conclude that uttering both the element and the predicate in a coherent pitch contour is important and I leave open the question of which one should have the F<sup>0</sup> peak. I am inclined to think that the element and the predicate form a single processing unit.

### **6.2.3 Summary of the corpus study**

This section argued that evoked, inferable, and declining topics tend to be produced in phrasal IUs, separately from the IU containing the predicate; that strongly evoked topics are typically produced in clausal IUs together with the predicate; and that foci tend to be produced in clausal IUs, although there are explainable exceptions.

Figure 6.29: Pitch contour of b in (26)

Figure 6.30: Pitch contour of a in (27)

Figure 6.31: Intonation unit vs. word order

However, as discussed in Chapter 5, topics tend to appear clause-initially and foci tend to appear right before the predicate. An element is more likely to be uttered in a clausal IU if it is closer to the predicate, which implies that foci are more likely to be uttered in clausal IUs. Therefore, it is not entirely clear whether information structure really affects the difference between phrasal and clausal IUs independently of word order. As an example, let us assume that (28) is a possible utterance that the speaker bears in his/her mind. "(‖)" indicates a potential IU boundary. For simplicity, let us assume that only one out of the three potential IU boundaries is realized in this utterance.

$$\text{(28)}\qquad \text{A}\ (\parallel_1)\ \text{B}\ (\parallel_2)\ \text{C}\ (\parallel_3)\ \text{predicate}$$

If the speaker wants to put an IU boundary in ‖<sub>1</sub>, the IU which includes A is a phrasal IU, whereas the IU which includes B and C is a clausal IU as schematized in (29).

$$\text{(29)}\qquad \boxed{\text{A}} \parallel_1 \boxed{\text{B C Predicate}}$$

On the other hand, if the speaker wants to put the IU boundary in ‖<sub>2</sub>, now the IU which includes A and B is a phrasal IU, whereas the IU which includes C is a clausal IU. This is schematized in (30).

$$\text{(30)}\qquad \boxed{\text{A B}} \parallel_2 \boxed{\text{C Predicate}}$$

This indicates that even though the speaker does not want to put the IU boundary in ‖<sub>1</sub>, A is uttered in a phrasal IU because of ‖<sub>2</sub> and ‖<sub>3</sub>; A is more likely to be uttered in a phrasal IU than B and C because it is uttered earlier. Similarly, B is more likely to be uttered in a phrasal IU than C. The effects of word order should not be ignored in the distinction between phrasal and clausal IUs. In fact, as Figure 6.31 shows, earlier elements are more likely to be produced in phrasal IUs than later elements.
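The reasoning behind (28)–(30) can be made concrete with a small sketch. It assumes, purely for illustration, that exactly one of the potential boundaries is realized and that each position is equally likely; the corpus makes no such claim. The function names `split_ius` and `phrasal_share` are hypothetical and not part of the study.

```python
from fractions import Fraction


def split_ius(elements, boundary):
    """Split a clause at one realized IU boundary.

    `boundary` is 1-based: boundary k falls right after the k-th element.
    Returns (phrasal IU, clausal IU); the clausal IU contains the predicate.
    """
    if not 1 <= boundary <= len(elements):
        raise ValueError("boundary must be between 1 and len(elements)")
    return list(elements[:boundary]), list(elements[boundary:]) + ["predicate"]


def phrasal_share(elements):
    """Share of boundary placements that put each element in a phrasal IU.

    Assumption (not from the corpus): every boundary position is equally likely.
    """
    n = len(elements)
    counts = {e: 0 for e in elements}
    for k in range(1, n + 1):
        phrasal, _ = split_ius(elements, k)
        for e in phrasal:
            counts[e] += 1
    return {e: Fraction(c, n) for e, c in counts.items()}
```

Under these assumptions, A ends up in a phrasal IU for all three boundary placements, B for two of them, and C for only one, which mirrors the word-order effect described above.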

In the next section, I discuss an experiment controlling for word order, and show that topics tend to be followed by an IU boundary, while foci do not.

### **6.3 Intonation unit and unit of information structure: experimental study**

In the previous sections, I investigated a corpus of spoken Japanese. In this section, I will show that my argument so far is also supported by a production experiment keeping word order constant.


### **6.3.1 Method**

This section gives an overview of the experimental methods. First, I explain how stimuli were made (§6.3.1.1), then I go over the experiment procedure (§6.3.1.2). Finally, I explain how the recordings acquired were annotated (§6.3.1.3).

### **6.3.1.1 Stimuli**

First, I made a list of three-mora nouns without an accent nucleus (the expected pitch pattern is LHH). I chose basic words that are used in everyday life, such as *sakura* 'cherry blossom' and *koinu* 'puppy'. I used an electronic dictionary of Japanese called *UniDic* to search for words (Den et al. 2002; 2007).<sup>4</sup> I chose words of this accent type to exclude the potential effect of the accent of these words on the following words. Second, I collected a list of verbs starting with low pitch. The second mora of these verbs should be high because the first and the second mora of a word should be distinct in pitch, as discussed in §2.4.1. I chose these verbs in order to observe the difference in F<sup>0</sup> between the first and the second morae. Third, I made 14 pairs of a noun and a verb of high collocation using *Case Frame* (Kawahara & Kurohashi 2006a,b).<sup>5</sup> Seven pairs are subject-verb, and the remaining seven pairs are object-verb, using the same noun. The stimuli can be schematized as in (31), where N indicates noun and V indicates verb.

(31) [LHH]<sub>N</sub> [LH...]<sub>V</sub>
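The noun selection step can be sketched as a simple filter. This is only an illustration: it assumes a lexicon given as kana strings with an accent type, where accent type 0 follows the common convention of marking a word without an accent nucleus. The toy entries and the names `count_morae` and `candidate_nouns` are hypothetical and do not reproduce the UniDic search actually used.

```python
# Small kana that merge with the preceding kana into a single mora.
SMALL_KANA = set("ゃゅょぁぃぅぇぉャュョァィゥェォ")


def count_morae(kana):
    """Count morae in a kana string (small ya/yu/yo etc. do not add a mora)."""
    return sum(1 for ch in kana if ch not in SMALL_KANA)


def candidate_nouns(lexicon):
    """Keep three-mora nouns whose accent type is 0 (no accent nucleus)."""
    return [
        entry["romaji"]
        for entry in lexicon
        if count_morae(entry["kana"]) == 3 and entry["accent"] == 0
    ]


# Hypothetical toy lexicon; the accent values are illustrative, not UniDic output.
TOY_LEXICON = [
    {"romaji": "sakura", "kana": "さくら", "accent": 0},
    {"romaji": "koinu", "kana": "こいぬ", "accent": 0},
    {"romaji": "inoti", "kana": "いのち", "accent": 1},
    {"romaji": "gakkoo", "kana": "がっこう", "accent": 0},
]
```

With the toy lexicon, `candidate_nouns(TOY_LEXICON)` keeps only *sakura* and *koinu*: *inoti* is excluded by its accent type and *gakkoo* by its mora count.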

Finally, I made two contexts for each pair; in one context, the noun is interpreted as topic, and in the other context, the noun and the verb as a whole are interpreted as focus.

Examples of the two kinds of contexts and of the noun-verb pairs are shown in (32) and (33). The target sentence is *koinu yuzut-ta* '(I/we) gave (a/the) puppy'. In (32), where the noun is intended to be interpreted as topic and the verb as focus, the referent of the noun *koinu* 'puppy' has already been shared between the speaker and the hearer. Only the verb *yuzut-ta* 'gave' is news to the hearer. In all the examples, the context forces the speakers to assume topics to be unused.

(32) **Predicate-focus context**: Yesterday the speaker and his/her friend found an abandoned puppy on the street. The speaker brought it to his/her home. Today, the speaker tells the friend what happened to the puppy.

<sup>4</sup>http://sourceforge.jp/projects/unidic/

<sup>5</sup>http://reed.kuee.kyoto-u.ac.jp/cf-search/

sooieba by.the.way [**koinu**] puppy [**yuzut-ta**] -yo give-past-fp 'By the way, (I) gave the puppy (to somebody).'

In (33), on the other hand, where both the noun and the verb are intended to be interpreted as focus, the referent of the noun *koinu* 'puppy' has not been shared. Not only the verb 'gave', but also 'a puppy' is brand-new to the hearer.

(33) **All-focus context**: The speaker and his/her friend are working at an animal shelter. The friend was absent yesterday and wants to know what happened yesterday.

> kinoo-wa yesterday-*wa* [**koinu** puppy **yuzut-ta**] -yo give-past-fp 'Yesterday (we) gave puppies.'

After I made the stimuli, I randomized their order so that the same target sentences (with predicate-focus and all-focus contexts) do not appear adjacent to each other.
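One way to obtain such an order is rejection sampling: shuffle, check the adjacency constraint, and reshuffle if it is violated. The sketch below is an assumption about how this could be done, not a description of the procedure actually used; the stimulus labels and the function names are hypothetical.

```python
import random


def has_adjacent_repeat(order):
    """True if two neighbouring stimuli share the same target sentence."""
    return any(a[0] == b[0] for a, b in zip(order, order[1:]))


def randomize(stimuli, seed=None, max_tries=10_000):
    """Shuffle until no target sentence appears twice in a row (rejection sampling)."""
    rng = random.Random(seed)
    order = list(stimuli)
    for _ in range(max_tries):
        rng.shuffle(order)
        if not has_adjacent_repeat(order):
            return order
    raise RuntimeError("no valid order found; relax the constraint or raise max_tries")


# Hypothetical stimulus list: 14 target sentences x 2 contexts.
STIMULI = [
    (f"target{i:02d}", context)
    for i in range(1, 15)
    for context in ("predicate-focus", "all-focus")
]
```

Calling `randomize(STIMULI, seed=1)` returns all 28 stimuli in an order where the two contexts of the same target sentence are never adjacent.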

### **6.3.1.2 Experimental procedure**

I asked seven native speakers of standard Japanese to read aloud the stimuli. All participants grew up in Tokyo or near Tokyo (e.g. Saitama), where standard Japanese is spoken. All of them have lived for more than a year outside of the areas where standard Japanese is not spoken. Four of the participants are male, and three are female. I recorded their production using EDIROL (R09-HR) and the internal microphone.

### **6.3.1.3 Coding process**

After the recording, I coded their speech using Praat.<sup>6</sup> First, I divided each target sentence into morae, then I divided each mora into a consonant (if present) and a vowel. Second, I measured the F<sup>0</sup> of the midpoint of the vowels with a Praat script.
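The midpoint measurement can be illustrated independently of Praat. The sketch below assumes a pitch track given as parallel lists of frame times and F<sup>0</sup> values, with `None` for unvoiced frames, and vowel intervals given as `(label, start, end)` tuples. It is a stand-in for the Praat script, whose details are not given here, and all names are hypothetical.

```python
import bisect


def f0_at(times, f0, t):
    """Linearly interpolate F0 (Hz) at time t; None where the track is unvoiced."""
    i = bisect.bisect_left(times, t)
    if i < len(times) and times[i] == t:
        return f0[i]
    if i == 0 or i == len(times):
        return None  # t lies outside the pitch track
    lo, hi = f0[i - 1], f0[i]
    if lo is None or hi is None:
        return None  # a neighbouring frame is unvoiced
    w = (t - times[i - 1]) / (times[i] - times[i - 1])
    return lo + w * (hi - lo)


def vowel_midpoint_f0(times, f0, vowels):
    """Map each vowel label to the F0 at the midpoint of its (start, end) interval."""
    return {
        label: f0_at(times, f0, (start + end) / 2)
        for label, start, end in vowels
    }
```

Vowels whose midpoint falls in an unvoiced stretch yield `None`, so they can be dropped from later calculations.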

### **6.3.2 Results**

Figures 6.32–6.35 show the F<sup>0</sup> of the vowels of each target sentence based on information structure. The graphs of Speakers 5–7 are omitted. On the x-axis, n1 indicates the first mora of the noun, n2 indicates the second mora, and v1 indicates the first mora of the verb, and so on.

<sup>6</sup>http://www.fon.hum.uva.nl/praat/

In some cases, there are fewer than 14 data points. This is because some vowels are devoiced. In standard Japanese, high vowels are often devoiced between two voiceless consonants, as in *kusuri* [kɯ̥sɯɾi] 'medicine'. However, this is not always the case. Therefore, the number of data points varies depending on the speaker.

The red lines indicate the plot of the predicate-focus context, while the blue lines indicate the plot of the all-focus context. The error bars indicate the standard deviations of F<sup>0</sup>. Although the error bars are large, it is clear that there is a pitch reset in v1, i.e., the first mora of the verb, and that the pitch rises again in v2, i.e., the second mora of the verb.
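As a rough illustration of how the plotted points and error bars could be derived, the following sketch computes the mean and the sample standard deviation of F<sup>0</sup> per mora position, skipping devoiced vowels. It assumes that the error bars show standard deviations and that missing values are stored as `None`; the function names are hypothetical.

```python
from statistics import mean, stdev


def summarize(values):
    """Mean and sample standard deviation of F0, skipping devoiced vowels (None)."""
    voiced = [v for v in values if v is not None]
    if len(voiced) < 2:
        return (voiced[0] if voiced else None), None
    return mean(voiced), stdev(voiced)


def summarize_by_position(rows):
    """rows: {mora position: [F0 per item]} -> {position: (mean, sd, n)}."""
    out = {}
    for position, values in rows.items():
        m, sd = summarize(values)
        out[position] = (m, sd, sum(v is not None for v in values))
    return out
```

The third value in each tuple is the number of voiced data points, which is why some positions have fewer than 14 observations.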

Figure 6.32: F<sup>0</sup> of vowels (Speaker 1)

A logistic regression analysis supports this impression. Tables 6.6 and 6.7 show the results of the regression analysis. The dependent variable is the F<sup>0</sup> difference between adjacent morae of each utterance; in Table 6.6, the dependent variable is the F<sup>0</sup> difference between n3 and v1, while, in Table 6.7, it is the difference between v1 and v2. The independent variables (predictors) are information structure (the distinction between the predicate- vs. all-focus contexts) and grammatical relation (the distinction between the subject and the object), in addition to speakers and items as random effects.
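The dependent variable can be illustrated with a descriptive sketch. It computes the per-utterance F<sup>0</sup> difference between two adjacent morae and compares the group means of the two contexts. This is not the model reported in the tables: it ignores grammatical relation and the random effects for speakers and items. With a single binary predictor and no random effects, however, the slope of an ordinary least-squares regression equals this difference of means. The data values below are made up for illustration.

```python
from statistics import mean


def f0_differences(utterances, earlier, later):
    """Per-utterance F0 difference `later - earlier`, grouped by context."""
    by_context = {}
    for u in utterances:
        if u[earlier] is None or u[later] is None:
            continue  # skip utterances with a devoiced vowel
        by_context.setdefault(u["context"], []).append(u[later] - u[earlier])
    return by_context


def context_effect(utterances, earlier, later):
    """Difference in mean F0 change: predicate-focus minus all-focus."""
    diffs = f0_differences(utterances, earlier, later)
    return mean(diffs["predicate-focus"]) - mean(diffs["all-focus"])


# Made-up values in Hz, for illustration only (not the experimental data).
TOY = [
    {"context": "predicate-focus", "n3": 200.0, "v1": 180.0},
    {"context": "predicate-focus", "n3": 210.0, "v1": 195.0},
    {"context": "all-focus", "n3": 200.0, "v1": 198.0},
    {"context": "all-focus", "n3": 205.0, "v1": 201.0},
]
```

A negative value of `context_effect(TOY, "n3", "v1")` means that the drop from n3 to v1 is larger in the predicate-focus context than in the all-focus context.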

Figure 6.33: F<sup>0</sup> of vowels (Speaker 2)

Figure 6.34: F<sup>0</sup> of vowels (Speaker 3)

Figure 6.35: F<sup>0</sup> of vowels (Speaker 4)

Table 6.6 shows that the predicate-focus context significantly contributes to the F<sup>0</sup> difference between n3 and v1. The fact that the estimate is negative indicates that the F<sup>0</sup> value of v1 is lower than that of n3, which leads to the conclusion that there is a pitch reset in v1. Table 6.7 shows that, on the other hand, both the predicate-focus structure and the subject significantly contribute to the F<sup>0</sup> difference between v1 and v2. The estimate is positive this time, which indicates that there is a pitch rise from v1 to v2.<sup>7</sup> To summarize, there is a pitch reset in the first mora of the verb in the predicate-focus context, where the noun is a topic, while the pitch reset is not observed in the all-focus context.

Table 6.6: Results of logistic regression analysis (v1-n3)


Examples of the pitch contour of actual productions are shown in Figures 6.36 and 6.37. In Figure 6.36, where one of the participants of the experiment uttered (32), there is a pitch reset in the first mora of the verb *yuzut-ta* 'gave', while in Figure 6.37, where the same participant uttered (33), there is no pitch reset.

<sup>7</sup> I do not have an explanation for why the subject also contributes to the pitch difference of verbs. Further investigation is definitely necessary.

Table 6.7: Results of logistic regression analysis (v2-v1)

| Coefficients | Estimate | p-value |
|---|---|---|
| Information structure (predicate-focus) | 8.5667 | 0.0149 \* |
| Grammatical relation (subject) | 8.2356 | 0.0221 \* |

Significance codes: \*\*\* p < 0.001; \*\* p < 0.01; \* p < 0.05; . p < 0.1

Figure 6.36: Pitch contour of (32)

Figure 6.37: Pitch contour of (33)

I also measured the vowel length of the last mora of the nouns. However, neither information structure nor grammatical relation significantly contributes to the vowel length. In addition, I conducted a regression analysis using the pitch-range difference between the noun and the verb as a dependent variable. Again, however, neither information structure nor grammatical relation significantly contributes to the pitch-range difference.

### **6.3.3 Summary of the experimental study**

In this section, I discussed the results of the production experiment and concluded that topic elements are produced intonationally separately from the focus predicate, namely, in phrasal IUs, while elements that form a focus together with the predicate are produced intonationally unified with the predicate, namely, in clausal IUs.

### **6.4 Discussion**

This section discusses motivations for intonation units.

### **6.4.1 Principles of intonation units, information structure, and activation cost**

I propose two closely related motivations for evoked, inferable, declining, and unused topics to be found in phrasal IUs and for foci to be found in clausal IUs. First, uttering an evoked, inferable, declining, or unused topic – typically a noun followed by a topic particle – is iconic and easy to process for both the speaker and the hearer. The same applies to uttering a focus – typically the predicate and optionally a noun – in another IU. I call this the iconic principle of intonation unit and information structure (34).

(34) **The iconic principle of intonation unit and information structure**: In spoken language, an IU tends to correspond to a unit of information structure.

This motivates the tendency for evoked topics to be uttered in a phrasal IU and for foci to be uttered in a clausal IU.

Second, strongly evoked elements are proposed to be produced in a coherent IU with the predicate, namely, in a clausal IU; elements with low activation cost are not produced by themselves. Based on this observation, I propose the principle of IU and activation cost.

(35) **The principle of intonation unit and activation cost**: all substantive IUs have similar activation costs; there are few IUs with only a strongly evoked element or with too many new elements.

This is inspired by, but also elaborates on, the "one new idea at a time" constraint in Chafe (1987; 1994). Chafe (1987; 1994) and Matsumoto (2003), who follows Chafe, consider that this "one idea" corresponds to a grammatical category such as subject, object, or verb. Chafe (1994: p. 110 ff.), for example, discusses IUs consisting of an object and a verb as exceptional. He argues that, in such IUs, there are special reasons for an object and a verb to be produced in an IU; verbs have been already evoked, the IU includes a low-content verb (such as "*have, get, give, do, make, take, use* and *say*", p. 111), or the object and the verb constitute a lexicalized phrase (such as *wash dishes*). However, in my corpus, IUs with an object and a verb (or a subject and a verb) to which these conditions do not apply are not rare. For example, *toti-o uba-u* 'deprive (somebody) of land' is produced in a single IU. However, the expression is not frequently used in everyday life and the predicate *uba-u* 'deprive' is mentioned for the first time in this monologue. The verb *uba-u* 'deprive' is not low-content, either.


Similarly, in (37-b), *i-nai kata-ga nana-wari* 'those who are absent consist of 70%' is neither conventionalized nor evoked, but it is still produced in a single IU.

(37) a. Those who do not pay back their debt consist of 30 %. b. sorekara then ‖ **i-nai** exist-neg **kata-ga** person-*ga* nana-wari-to seven-ratio-quot ‖ 'And, those who are absent consist of 70%.' (S00M0221: 348.22-356.07)

I argue that the NP and the verb are produced in a clausal IU because they constitute a unit of information structure: focus. At the same time, they form a syntactic constituent: VP. A unit of focus can contain several clauses through clause-chaining, but such a unit is usually realized not as a single IU but as several IUs because of processing limitations, which is captured by the principle in (35).

The principles in (34) and (35) compete with each other and both shape the actual IUs.


### **6.4.2 Principle of the separation of reference and role**

I argue that intonation units play an important role in clause-chaining. As discussed in Chapter 5, uttering persistent elements clause-initially (with topic markers) is especially useful in clause-chaining languages; this announces which element becomes zero in the following utterance. These clause-initial elements are often uttered in phrasal IUs rather than clausal IUs. For example, in (38), *eberesuto-kaidoo* 'Everest trail' appears clause-initially, followed by an IU boundary, and is mentioned three times in the following clauses as indicated by Ø.<sup>8</sup> This big chunk of clauses in (38) as a whole constitutes a sentence, and the clauses are combined through clause-chaining.

(38) a. **kono** this **eberesuto-kaidoo-toiuno-wa** Everest-trail-quot-*wa* ‖ 'This Everest Trail is'

	- b. tibetto-to Tibet-com nepaaru-no Nepal-gen ‖ kooeki-ro-ni-mo trade-road-dat-also nat-te become-and ori-masi-te prog-plt-and ‖ 'also used for trading between Tibet and Nepal.'

	- c. ma fl zissai-wa actual-*wa* nihon-de Japan-loc iu-to say-cond ‖ 'Say, in Japan for example,'

	- d. **Ø** Ø takao-san-mitaina Takao-mountain-like ‖ yama-miti-nan-desu-keredomo mountain-road-nmlz-cop.plt-though ‖ 'it's like a road in Mt. Takao or something.'

	- e. genti-no local-gen hito∼bito-nitotte-wa person∼pl-for-*wa* ‖ ee fl ‖ **Ø** Ø tuusyoo-ro-to trade-road-quot ‖ iu-yoona say-like

	- f. insyoo-no impression-gen **Ø** Ø miti-desi-ta road-cop.plt-past ‖ 'it was like a trading road for local people.' (S01F0151: 105.73-120.14)

To schematize, utterances like (39) are frequently observed.

(39) a. Topic ‖

<sup>8</sup>Ø is assumed to appear right before the predicate for the purposes of presentation. However, this assumption does not affect the analysis here.


First, the topic is uttered clause-initially (often coded by topic markers) in a phrasal IU. Then the explanation about the topic follows it. In other words, expressions like (39-a) followed by an IU boundary establish topics to be mentioned in the following discourse.

This type of example is small in number per monologue because there are only a few topics introduced in each monologue. This blurs the pattern shown in (39) in a simple count of raw numbers like the one shown in Table 6.2 and Figure 6.2.

I argue that the tendency schematized in (39) is a realization of the principle of the separation of reference and role proposed by Lambrecht (1994). Lambrecht (1994: 184-185) argues: "[t]he non-canonical configurations thus allow speakers to separate the referring function of noun phrases from the relational role their denotata play as arguments in a proposition. [...] I will call the grammatical principle whereby the lexical representation of a topic referent takes place separately from the designation of the referent's role as an argument in a proposition the principle of the separation of reference and role (PSRR) for topic expressions. The communicative motivation of this principle can be captured in the form of a simple pragmatic maxim: 'Do not introduce a referent and talk about it in the same clause'." In Japanese, the PSRR is reflected by the fact that topic elements are also separated intonationally from the clause.

### **6.5 Summary**

### **6.5.1 Summary of this chapter**

This chapter analyzed intonation units in Japanese in terms of whether an NP is intonationally separated from the predicate or not. It argued that evoked, inferable, declining, and unused topics tend to be separated intonationally from the predicate, while strongly evoked topics tend to be integrated into the predicate. On the other hand, focus elements tend to be integrated into the predicate to form a unit of focus with the predicate. I proposed three inter-related principles that are at work in determining intonation units in Japanese.

### **6.5.2 Remaining issues**

In this chapter, I proposed to modify the definition of intonation units. Further studies are needed to investigate cognitively valid definitions of intonation unit. Furthermore, it is also necessary to develop a methodology to find a unit of processing independent of intonation to avoid circularity.

# **7 Discussion: Multi-dimensionality of linguistic forms**

### **7.1 Summary of findings**

The findings so far are summarized in Tables 7.1 and 7.2.

Table 7.1: Summary of topic


Table 7.2: Summary of (broad) focus


Overall, I showed that correlated but distinct features affect particle choice, word order, and intonation in spoken Japanese. The features proposed are summarized in (2) in Chapter 3, which is repeated here as (1) for convenience.



In Chapter 4, I concentrated on particles. Topic markers such as *toiuno-wa*, *wa*, and *kedo/ga* are sensitive to the assumed status of the referent according to the given-new taxonomy. All topic markers code elements that are presupposed to be shared between the speaker and the hearer and cannot be negated in a normal way. What this amounts to is that topic markers are sensitive to a and b in (1). The marker *toiuno-wa* codes elements referring to an entity with an evoked status in the hearer's mind. The marker *wa* codes elements referring to an entity with an inferable status, in addition to marking elements that can also be coded by *toiuno-wa*. The marker *kedo/ga* preceded by the copula *da* or *desu* codes elements referring to an entity that is declining or unused in the assumed hearer's mind. Topic markers are optional except for contrastive topics. In formal speech style, topic markers tend to appear. In addition to whether the referent in question is evoked or not, I also showed that topic markers are partially sensitive to grammatical function (f in (1)): when the clause has two evoked arguments, A and P, A is more likely to be coded by a topic marker (in this case, *wa*) than P.

Case markers are, on the other hand, sensitive to whether the referent is (part of) an assertion or not (a in (1)), in addition to grammatical functions (f in (1)). A, agent S, and optionally patient S are coded by *ga*, whereas patient S and P tend to be coded by *Ø*. A, S, and P in the argument focus or narrow focus environment are coded by explicit markers. I (and the previous literature) also suggested the possibility that *ga* and *o* are sensitive to animacy (e in (1)).

In Chapter 5, I focused on word order. I showed that shared elements, which correlate with topics, tend to appear clause-initially irrespective of their status in the given-new taxonomy. Strongly evoked elements can appear post-predicatively, especially in conversation. Post-predicate elements are sensitive to the given-new taxonomy (b in (1)), while clause-initial elements are sensitive to identifiability. On the other hand, foci tend to appear pre-predicatively (i.e., immediately before the predicate). Pre-predicate elements tend to refer to non-shared entities, in contrast to clause-initial topics. Word order is also sensitive to grammatical function (f in (1)), as classically observed. The referent of clause-initial elements is referred to by zero pronouns in the following discourse, while the referent of pre-predicate elements tends to re-appear as a full NP, even repeatedly.

In terms of word order, I proposed that three inter-related principles – repeated here as (2), (3), and (4) – are at work when determining word order in spoken Japanese. Principles (2) and (4) predict that topics appear clause-initially, while principle (3) and the assumption that Japanese is a verb-final language predict that foci appear pre-predicatively.


Perhaps there is no principle that predicts the order of strongly evoked elements because such elements are not necessary; since the referent is strongly evoked, the hearer is assumed to be able to identify it. They are produced for intonational or interactional reasons, as has been discussed in §5.3.2.

In Chapter 6, I discussed intonation. I showed that evoked, inferable, declining, and unused topics tend to be produced in an intonation unit separate from the predicate, while strongly evoked topics tend to be produced in an intonation unit together with the predicate. On the other hand, broad focus tends to appear in an intonation unit with the predicate to form a unit of predicate focus structure. I proposed two principles determining intonation units in Japanese, repeated here as (5) and (6). Principle (5) predicts that a topic appears in an intonation contour and that a focus appears in another intonation contour, whereas principle (6) predicts that strongly evoked topics are produced in the same IU as the focus.



To be more precise, these principles predict that when the activation cost of a topic is high, it is separated intonationally from the focus predicate, as in (7-a); whereas when its activation cost is low, it is produced with the focus predicate, as in (7-b-c). Each box corresponds to an IU.


### **7.2 Competing motivations**

As summarized above, individual features such as topic or focus do not determine particle choice, word order, or intonation; rather, single linguistic expressions are influenced by multiple features. This is not a rare phenomenon: it is frequently observed in languages and it is a source of language change. Comrie (1979) called this variability "seepage". As has been discussed in §4.3.1.3, *ko* in Hindi codes definite or animate direct objects – here there is also no individual feature determining the use of this particle. Citing Poppe (1970), he discusses another example from Mongolian. According to Poppe, the accusative suffix *-iig* only attaches to certain kinds of direct object, namely human direct objects, as exemplified in (8).


On the other hand, non-human direct objects are optionally followed by the suffix, as in (9). In this case, definiteness plays an important role. To complicate things, the suffix also attaches to indefinite direct objects when they are apart from the verb.

	- b. zurag-iig picture-do Choidog Choidog zurav painted 'Choidog painted the picture. (As for the picture, it was Choidog that painted it.)' (Comrie 1979: 19)

The distinction between the so-called accusative marker *o* and zero particles in Japanese is similar to the use (or non-use) of this suffix *-iig* in Mongolian. The choice between *o* and zero particles is reported to be determined by definiteness, animacy, and word order: definite or animate objects are more likely to be coded by *o* than by zero particles (Minashima 2001; Fry 2001; Kurumada & Jaeger 2013; 2015). Also, according to Tsutsui (1984), Matsuda (1996), and Fry (2001), verb-adjacent objects are more likely to be zero-coded (and hence less likely to be *o*-coded), while non-verb-adjacent objects are more likely to be coded by *o*, although the distinction is subtle.

Du Bois (1985) argues that the multi-dimensionality of a linguistic expression is based on "competing motivations". A relevant example of competing motivations provided by Du Bois is the distinction between ergative-absolutive and nominative-accusative languages.

The reason that not all languages are ergative – i.e. that some languages choose the 'option' of categorizing S with A rather than with O [P in terms of this study] – is that there is another motivation which competes for the same limited good, the structuring of the person-number-role paradigm. [...] S and A are united by their tendency to code referents which are human, (relatively) agentive, and maintained as topics over significant stretches of discourse ('thematic'). Thus, a discourse pressure to roughly mark topic/agent motivates nominative-accusative morphology, while a discourse pressure to roughly mark new information motivates ergative-absolutive morphology. These two pressures may be seen as competing to overlay a secondary function on the existing A/S/O base (though this formulation is of course somewhat oversimplified). [...] Thus the answer to the question as to why not all languages are ergative is simply that, while there is a strong discourse pressure which motivates an absolutive category, there is an equally strong – possibly stronger – discourse pressure which motivates a nominative category. Both motivations cannot prevail in the competition for control of the linguistic substance of this paradigm. (Du Bois 1985: 354–355)

My study showed competing motivations that affect particle choice, word order, and intonation in spoken Japanese. For example, as has been discussed in §4.5.2 and Nakagawa (2013), case particles are sensitive to focushood and thus P and patient S are unmarked (zero-coded). On the other hand, topic markers are sensitive to topichood and thus A and agent S are unmarked in another dialect, Kansai Japanese.


If it were single features (such as "topic" or "focus") that determined word order and particle choice, it would be expected, for example, that all clause-initial elements should be coded by topic markers, since both clause-initial elements and those coded by topic markers would be topics. However, this is not the case, as shown in §5.2.1.1. Although clause-initial elements tend to be coded by topic markers, not all of them are. This is because, while both word order and topic coding are sensitive to topichood and focushood, they are sensitive to different features: clause-initial elements are sensitive to identifiability, whereas topic markers are sensitive to the activation status of the referent in question.

The claim of this study is an elaboration of the claim made by Li & Thompson (1976) that Japanese is a subject-prominent and topic-prominent language. In terms of this study, the claim is elaborated in the following way: Japanese is sensitive to various features related to topichood and focushood – such as presupposition vs. assertion (a in (1)) – and the status in the given-new taxonomy (b in (1)), in addition to grammatical function (f in (1)).

The theory of competing motivations and correlating features of topic and focus in (1) predicts that there are other types of languages, such as animacy-prominent languages and specificity-prominent languages. As far as I am aware, the literature attests at least what I call animacy-prominent languages (Dahl & Fraurud 1996; Minkoff 2000; de Swart et al. 2007: *inter alia*). For example, in grammatical sentences in Mam-Maya, the subject is as animate as, or more animate than, the object (Minkoff 2000). Another well-known example is Navajo (Athapaskan). In Navajo, the order of S and P can be either SP or PS. In the case of an SP order, the marker *yi* attaches before the verb; in the case of a PS order, the marker *bi* attaches to the verb (Hale 1972; Frischberg 1972). This is exemplified in (10). In (10-a), where the subject 'horse' precedes the object 'mule', the affix *yi* attaches to the verb. In (10-b), on the other hand, where the object precedes the subject, *bi* is used.


(Hale 1972: 300)

When the subject and the object are equally animate, as in (10), both *yi-* and *bi-*constructions can be used. However, when the subject is more animate than the object, only the *yi-*construction with SP order is grammatical, whereas when the object is more animate than the subject, only the *bi-*construction with PS order is grammatical. These languages can be called animacy-prominent languages in the sense that animacy constrains word order or grammatical functions.
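The animacy constraint on the Navajo *yi-*/*bi-* alternation described above can be restated as a small decision rule. The following Python sketch is only an illustration under stated assumptions: the function name `allowed_constructions` and the integer animacy ranks are hypothetical devices introduced here, not part of the cited analyses, and the rule encodes nothing beyond the three cases mentioned in the text.

```python
def allowed_constructions(subject_animacy, object_animacy):
    """Return the constructions that the text describes as grammatical.

    Animacy is passed as a hypothetical integer rank (higher = more animate);
    the ranks are an assumption of this sketch, not a claim about Navajo.
    """
    if subject_animacy > object_animacy:
        # Subject outranks object: only yi- with SP order.
        return {"yi (SP order)"}
    if subject_animacy < object_animacy:
        # Object outranks subject: only bi- with PS order.
        return {"bi (PS order)"}
    # Equal animacy, as with 'horse' and 'mule' in (10): both are possible.
    return {"yi (SP order)", "bi (PS order)"}


# Illustrative ranks only (assumed for this sketch).
HUMAN, ANIMAL = 2, 1

print(allowed_constructions(ANIMAL, ANIMAL))  # equal animacy: both options
print(allowed_constructions(HUMAN, ANIMAL))   # subject more animate
print(allowed_constructions(ANIMAL, HUMAN))   # object more animate
```

Stating the rule this way makes the "hard constraint" character of the pattern explicit: the output is categorical, in contrast to the statistical tendencies reported for Japanese.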

Finally, I point out that this kind of multivariate analysis is not compatible with theories like generative grammar. For example, Endo (2014), following Rizzi's cartography theory (e.g., Rizzi 1997; 2004), points out that "an information focus occurs immediate left to the verb" (p. 170).<sup>1</sup> This observation is compatible with that of Kuno (1978). In the following example (11-A), *hon* 'book' is a focus because it is the answer to the *wh*-question (11-Q). The focus appears immediately before the verb *kai-masi-ta* 'bought'.

(11) Q: What did you buy?

A: watasi-wa **hon-o** kai-masi-ta
1sg-top book-acc buy-plt-past
'I bought a book.' (Endo 2014: 170–171)

As we immediately notice, however, the focus *hon* 'book' is the object (P) of the sentence at the same time. In the cartography framework, it is not clear how to represent an element which is a focus and the object at the same time.

### **7.3 Languages with hard constraints**

This study showed a variety of statistical tendencies regarding particle choice, word order, and intonation in Japanese. In particular, in Chapters 5 and 6, I discussed the distinction between elements that appear close to the predicate (in terms of word order) and are glued to the predicate (in terms of intonation), and elements that appear separately from the predicate (in terms of both word order and intonation). In this section, I discuss other languages that have conventionalized the statistical tendencies shown in this study. As Bresnan et al. (2001) state, "soft constraints mirror hard constraints"; namely, "[t]he same categorical phenomena which are attributed to hard grammatical constraints in some languages continue to show up as statistical preferences in other languages, motivating a grammatical model that can account for soft constraints" (p. 29). See also Givón (1979); Bybee & Hopper (2001).

In §7.3.1, I discuss languages that integrate some elements into the predicate. In §7.3.2, I focus on languages that separate some elements from the predicate.

<sup>1</sup>An information focus is "the answer to *wh*-questions and the target of negation" (ibid.), which is the same focus discussed in this study.


### **7.3.1 Elements glued to the predicate**

There are two kinds of elements proposed in this study that are glued to the predicate: strongly evoked elements that are postposed and focus elements.

### **7.3.1.1 Affixation of pronouns**

First, I discuss languages where strongly evoked elements, especially pronouns, are glued to the predicate. As discussed in §5.3, strongly evoked elements in spoken Japanese can appear immediately after the predicate, in a single intonation contour with the predicate. This is a statistical tendency (i.e., a soft constraint) rather than a categorical phenomenon (i.e., a hard constraint), showing that strongly evoked elements tend to be glued to the predicate. I argue that in languages with hard constraints, this corresponds to so-called "grammatical agreement". In languages with grammatical agreement, an affix coreferential with the subject or the object typically attaches to the verb. As Givón (1976: 151) states, "[grammatical agreement and pronominalization] are fundamentally one and the same phenomenon, and [...] neither diachronically nor, most often, synchronically could one draw a demarcating line on any principled grounds." He argues that "subject grammatical agreement" arose from topic-shift constructions like (12-a), which are reanalyzed as "subject-verb agreement", as in (12-b).

(12) a. Topic shift
The man, **he** came.
(topic) (pronoun) (verb)
b. Neutral (reanalyzed)
The man **he**-came.
(subject) (agreement)-(verb)

(Givón 1976: 155)

Givón argues that "[t]he morphological binding of the pronoun to the verb is an inevitable natural phenomenon, cliticization, having to do with the unstressed status of pronouns, their decreased information load and the subsequent loss of resistance to phonological attrition" (p. 155). The following are examples from Swahili (Bantu). In (13-a), the subject *m-toto* 'child (class 1)' has an agreement relationship with the verb prefix *a* 'he (class 1)'. According to Givón, the verb prefix *a* originates from a pronoun. Similarly, in (13-b), the subject *ki-kopo* 'cup (class 7)' agrees with *ki* 'it (class 7)'. The examples are glossed based on Contini-Morava (1994).


(13) b. **ki**-kopo **ki**-li-vunjika
cl7-cup 3sbj.cl7-past-break
'The cup broke.' (Givón 1976: 157)

Further, Swahili also features preposed objects, and they have an agreement relationship with the verb affix similar to that of subject agreement. The object *m-toto* 'child (class 1)' agrees with the interfix *kw* 'him (class 1)', as in (14-a), and the object *ki-kopo* 'cup (class 7)' agrees with *ki* 'it (class 7)', as in (14-b).

(14) b. **ki-kopo**, ni-li-**ki**-vunja
cl7-cup 1sg-past-3obj.cl7-break
'The cup, I broke it.' (ibid.)

Dryer (2013b) states that "[l]anguages in which pronominal subjects are expressed by pronominal affixes are widespread throughout the world." According to him, in 437 out of 711 languages, "pronominal subjects are expressed by affixes on verbs." Mian (Ok, Papua New Guinea) is one of those languages. As shown in (15), in Mian, the subject is expressed by the suffix *i*, and the object is expressed by the prefix *a*.

(15) **nē** naka=e a-temê'-b-**i**=be
1sg man=sg.m 3sg.m.obj-see.impfv-impfv-1sg.sbj=decl
'I am looking at the man.' (Fedden 2007: 261)

Givón (1976) argues that subject-agreement stems from topic-shift constructions like (12), while object-agreement originates from afterthought-topic constructions like (16), i.e., post-predicate constructions, at least in SVO languages.

	- c. Neutral I saw-**him** the man.


Deaccented pronouns in Japanese can be interpreted as incipient pronominal affixes.

### **7.3.1.2 Noun incorporation**

While focus elements tend to be produced pre-predicatively in a coherent intonation contour with the predicate in Japanese, I propose that, in languages with hard constraints, focus elements are incorporated into the predicate. In this section, I point out some similarities between focus elements in the predicate focus environment and incorporated nouns. Also, I discuss similarities between focus zero-coding and noun incorporation based on Mithun (1984). In noun incorporation, a nominal and a predicate form a unit; nominals and the predicate are phonologically, morphologically, and syntactically cohesive. According to Mithun (1984), zero-coding is the first stage of noun incorporation.

First, as Mithun (1984) states, typically incorporated nouns are indefinite and/or non-specific, which are features that correlate with focus. Definite and/or specific nouns, which are closer to topics, are not incorporated into the verb. Examples are shown below from Onondaga. Woodbury (1975b: 11) states that "[i]t is generally agreed that a noun which is incorporated makes a more general reference than one which is free of the verb stem." In (17-a), the noun 'tobacco', which is not incorporated into the verb, refers to specific tobacco, and, as the translation shows, it is interpreted as definite. On the other hand, in (17-b), the incorporated noun 'tobacco' refers to tobacco in general rather than a specific tobacco, as the translation shows.

(17) a. waʔ-ha-hninú-ʔ neʔ oyɛ́ʔkwa-ʔ
tr-3sg-buy-asp the **tobacco**-n.s.
'He bought the tobacco.'

b. waʔ-ha-yɛʔkwa-hní:nu-ʔ
tr-3sg-**tobacco**-buy-asp
'He bought tobacco.' (Woodbury 1975b: 10)

Similarly, in pseudo-noun incorporation in Niuean (Oceanic), definite nouns cannot be incorporated into the verb. Niuean is a VSO language; canonically, the object appears after the subject. On the other hand, incorporated objects appear after the verb (before the subject), which is how noun incorporation can be identified. Unlike in typical noun incorporation, incorporated nouns in Niuean can be accompanied by modifiers, as shown in (18). This is why Massam (2001) calls this pseudo-noun incorporation. Note that the A argument *mele* is coded as absolutive instead of ergative.

(18) Niuean (Oceanic)


Niuean does not allow nouns coded by case markers or number articles to be incorporated because they are interpreted as definite and/or specific.


In Southern Tiwa, all inanimate direct objects must be incorporated, while animate direct objects are optionally incorporated (Allen et al. 1984). As shown in the contrast between (20-a) and (20-b), the inanimate object *shut* 'shirt' must be incorporated; otherwise, the sentence is ungrammatical.

(20) Southern Tiwa (Tanoan)


(Allen et al. 1984: 293)

Animate objects, on the other hand, are only optionally incorporated: the sentence is grammatical irrespective of whether they are incorporated or not, as shown in (21-a-b).

(21) a. ti-**seuan**-mũ-ban
1sg.A-man-see-past
'I saw the/a man.'

b. seuanide ti-mũ-ban
man 1sg.A-see-past
'I saw the/a man.' (Allen et al. 1984: 294–295)

Southern Tiwa is sensitive to animacy instead of definiteness. However, Southern Tiwa is like Onondaga and Niuean in the sense that Ps with features correlating with focus are incorporated, while Ps with features correlating with topic may remain unincorporated.

Second, while patient nouns tend to be incorporated into the verb, agent nouns are not incorporated (Mithun 1984; Baker 1988). In Southern Tiwa, for example, the patient Ss 'dipper' and 'snow' are incorporated in (22), while the agent S, 'dog', cannot be incorporated, as in (23).

(22) Southern Tiwa (Tanoan)


This is parallel to Onondaga, as shown by the contrast between (24) and (25). Patient S is incorporated into the verb, as in (24), while agent S cannot be incorporated, as in (25). Glosses are based on Baker (1988: 87–89).



Mithun (1984: 875) argues that, verb-internally, incorporated nouns bear a limited number of possible semantic relationships to their host verbs. This applies no matter whether the language is basically of the ergative, accusative, or agent/patient type. She proposes the following hierarchy of possible noun incorporations in different languages. Agent S and A are put in parentheses because they are not attested in Mithun's data. The hierarchy implies that languages which incorporate patient Ss can also incorporate Ps, but not necessarily vice versa.

(26) P > patient S (> agent S > A)

I point out that the hierarchy in (26) explains the variety of zero-coding cross-linguistically. According to Mithun (1984), simple juxtaposition of a noun (without any markers) and a verb is the first stage of noun incorporation. There are many examples of languages without P-coding discussed in the literature (Comrie 1979; 1983; Croft 2003; Aissen 2003; Haspelmath 2008: *inter alia*). In these languages, Ps with features correlating with topic, i.e., animate, human, and/or definite Ps, are overtly coded, while Ps with features correlating with focus are zero-coded. Some examples are discussed above as (8)–(9) in §7.2. Another example is from Russian, which has a special marker for animate (or human) Ps, but not for inanimate Ps. As shown in the following examples, *nosorog* 'rhinoceros' in (27-a), an animate P, is overtly coded by the direct object marker *a*, whereas *il* 'slime', an inanimate P, is zero-coded.


Examples of languages without P- and patient-S-coding are (Standard) Japanese and Lahu. In (Standard) Japanese, as discussed in §4.3.1, agent S tends to be coded overtly, as in (28-a), while patient S tends to be zero-coded, as in (28-b-c) (Kageyama 1993: 93).



(29) and (30) are examples from Lahu. As shown in (29), the definite P 'the liquor' in (29-a) is coded with the accusative marker, while the indefinite P 'liquor' in (29-b) is not.

(29) Lahu (Tibeto-Burman)

a. jɨ̀ thà' dɔ̀
**liquor** acc drink
'to drink (the) liquor'

b. jɨ̀ dɔ̀
**liquor** drink
'to drink liquor' (P) (Matisoff 1981: 307)

As shown in (30), the indefinite patient S is also zero-coded in Lahu (ibid.).


There are also languages which zero-code P, patient S, and agent S. In Kansai Japanese, for example, agent S can also be zero-coded in addition to P and patient S. (28-a) without *ga* is acceptable in Kansai Japanese (see also Nakagawa 2013).<sup>3</sup>
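The implicational reading of the hierarchy in (26) can be checked mechanically: under that reading, the set of roles a language zero-codes (or incorporates) should form an unbroken initial stretch of the hierarchy. The Python sketch below assumes this reading. The names `HIERARCHY` and `respects_hierarchy` are hypothetical, and the role sets merely restate, in simplified form, the zero-coding patterns mentioned in this section.

```python
# Roles ordered as in (26): P > patient S (> agent S > A).
HIERARCHY = ["P", "patient S", "agent S", "A"]


def respects_hierarchy(roles):
    """True if `roles` is an unbroken initial stretch of HIERARCHY.

    For example, {"P", "patient S"} passes, but {"patient S"} alone does not:
    a role may be included only if every role above it is included too.
    """
    roles = set(roles)
    unknown = roles - set(HIERARCHY)
    if unknown:
        raise ValueError(f"unknown role(s): {sorted(unknown)}")
    return roles == set(HIERARCHY[:len(roles)])


# Zero-coding patterns as summarized in the text (simplified to role sets).
zero_coded = {
    "Russian": {"P"},
    "Standard Japanese": {"P", "patient S"},
    "Lahu": {"P", "patient S"},
    "Kansai Japanese": {"P", "patient S", "agent S"},
}

for language, roles in zero_coded.items():
    print(language, respects_hierarchy(roles))
```

A hypothetical language that zero-coded agent S but not patient S would fail this check, which is the kind of pattern the hierarchy predicts should not occur.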

### **7.3.2 Elements separated from the predicate**

As discussed in §§6.2.1 and 6.3, topics that have not been established are produced intonationally separate from the predicate. This section explores the possibility of the existence of languages with hard constraints, i.e., languages that do not allow unestablished topics to appear together with the predicate or the main clause.

I did not find languages which match this exact condition. However, one of the related phenomena is that, in some languages, indefinite non-generic NPs cannot in general be the subject; they can only be the subject of existential constructions (Givón 1976: 173ff.). I assume that, in these languages, the connection between subject (A and S) and topic is so strong that non-topical subjects are not allowed.

<sup>2</sup>The expression *mû-yè* as a whole means 'rain (noun)'; it originates from *mû* 'sky' and *yè* 'water' (Matisoff 1981: 60).

<sup>3</sup>Although the form of the sentence is identical, the pitch accent is drastically different and it is easy to distinguish Standard Japanese from Kansai Japanese. Grammaticality judgements are my own.


Canonical pre-verbal subjects in many Bantu languages are inherently topical and subjects cannot be foci in situ (see Downing & Hyman (2016) and works cited therein for a summary of information structure in Bantu languages). For example, in Northern Sotho, it is possible for the subject to appear in the canonical preverbal position, as in (31-a) or in the post-verbal position, as in (31-b).

(31) b. Go fihla **mo-nna**
cl17 arrive cl1-man
Lit. 'There arrives a man.' (Zerbian 2006: 171)

It is ungrammatical to put *wh*-words in the canonical pre-verbal position, as shown in (32).


In many Bantu languages, it appears that an NP must be introduced in a special clause of non-canonical VS order, and only after that can the NP be mentioned in a normal clause of canonical SV(P) order to bring the narrative forward.

Further, in French, which is an SV(P) language, VS order is used to focalize the subject, and the predicate typically expresses existence, emergence, and motion (Togo & Ohki 1986). Because the inverted subject is a focus, the scope of negation is the subject, as shown in (33-a), and it is unnatural to provide alternatives incompatible with the subject, as in (33-b).

(33) a. Dans cet immeuble n'habitent pas **des ouvriers français**, mais des ouvriers espagnols.
in this building not.live neg some workers French but some workers Spanish
'In this building, French workers do not live, but Spanish workers do.'

b. ??Dans cet immeuble n'habitent pas **des ouvriers français**, mais dans l'autre immeuble.
in this building not.live neg some workers French but in the.other building
'In this building, French workers do not live, but in the other building.' (Togo & Ohki 1986: 3, translated by NN)

It is infelicitous to put more new elements after the inverted subject. For example, (34-a), which is a typical subject inversion construction, is acceptable, whereas (34-b), which is (34-a) followed by another phrase 'by French and Japanese educators', is not acceptable.


Interestingly, however, if a pause is inserted between the VS part (*ont été discutés problèmes de l'éducation morale* 'problems of moral education were discussed') and the additional phrase (*par des pédagogues français et japonais* 'by French and Japanese educators') in (34-b), the acceptability improves. This suggests that a new NP is introduced in a special construction of VS order, and additional new information cannot be introduced within the same intonational phrase in French.

### **7.4 Summary**

This chapter summarized the study and discussed languages that grammaticalize the tendencies proposed in this study. Admittedly, the discussion provided more possibilities than conclusions. Further investigation is needed to analyze the exact associations between languages with hard constraints and those with soft constraints. It would also be worthwhile to identify the factors that determine whether a language has hard constraints or soft constraints.

# **8 Conclusion**

### **8.1 Summary**

This study attempted to partially answer a larger question of how Japanese speakers communicate with each other through assumptions regarding the mental state of other people. It revealed that Japanese speakers employ a variety of cues to express the speaker's assumptions about the hearer's mental state.

While a substantial portion of the literature has discussed the distinction between *wa* and *ga*, the relationships among other kinds of particles have not been discussed as thoroughly. Chapter 4 in this study revealed the differences between *wa* and other topic particles such as *toiuno-wa* and *kedo/ga* preceded by copula, as well as the distribution of case markers, by drawing a semantic map of these particles. It also investigated the distribution of zero particles and their associations with information structure.

The previous literature investigated clause-initial, pre-predicate, and post-predicate constructions independently in different frameworks; however, there was no unified account of word order in Japanese. In Chapter 5, I described word order in spoken Japanese in a unified framework.

Chapter 6 investigated intonation. While the previous literature mainly concentrates on contrastive focus, this study discussed both topic and focus. I investigated intonation as a unit of processing and argued that information structure influences the form of intonation units.

To the best of my knowledge, particles, word order, and intonation in Japanese have been investigated separately in the literature; there was no unified theory to account for all of these phenomena. This study investigated the phenomena as a whole in a consistent way by annotating the same information for all linguistic expressions and by employing the same analytical framework for all of them.

### **8.2 Theoretical and methodological implications**

This section discusses theoretical and methodological implications of this study. First, I proposed that topic and focus are multidimensional rather than homogeneous: they are interpreted as a bundle of features, where each feature is scalar rather than binary. Different languages are sensitive to different features to different degrees. Even within a language, different linguistic expressions are sensitive to different features to various extents. Moreover, it is often the case that a single linguistic expression is sensitive to multiple features. As outlined in Chapter 2, different authors discuss different kinds of topic and focus, which is a confusing situation. I argue that linguistic research would be clearer if one asks "what linguistic expression(s) is/are sensitive to what feature(s)?", instead of asking "which feature best predicts the distribution of some linguistic expressions?"

Second, I proposed methods of annotation and analysis that are cross-linguistically applicable. I did not annotate all the features proposed in (2) in §3.3; however, all the features can be defined independent of language-specific categories and can be applied universally. Some features such as specificity and definiteness are hard to annotate, and it is highly likely that different annotators have different intuitions about the expression in question. I argue that this is not a problem. In real life, some people might interpret some expression to be definite, while other people might interpret the same expression to be indefinite. This is a source of linguistic variation, and there is no single right answer. Ideally, a statistically sufficient number of annotators annotate the same corpus, and all the annotations are used in analyses.

Third, I point out the importance of qualitative analysis in addition to quantitative analysis. In §4.2, for example, I concluded that *toiuno-wa* and *wa* attach to elements with different statuses of the given-new taxonomy by examining each example, even though the difference was not visible from the raw numbers. This is because my annotation is not fine-grained enough to capture the subtle difference between these markers. Of course, it is necessary to run statistical tests in the future. However, it is also important to examine each example to make sure that the quantitative results do not contradict other observations.

### **8.3 Remaining issues**

This study has left several issues open for future investigation. In this section, I discuss two of these issues.

### **8.3.1 Predication or judgement types**

As discussed in Chapter 2, traditional Japanese linguistics scholars have paid attention to predication types or judgement types. Predication or judgement types include the distinctions between thetic vs. categorical judgements and between attribute vs. phenomenon judgements (Matsushita 1928; Yamada 1936; Mio 1948/2003; Kuroda 1972; Masuoka 2008a; Kageyama 2012). Although this study focused on the distinction between nominal types such as topic and focus, the findings of this study can be integrated into theories of predication or judgement types. This implies that information structure is not only related to properties of NPs; rather, it is also associated with properties of predicates. In particular, grammatical categories such as tense, aspect, modality, and evidentiality are highly likely to be related to different information structure types. For example, as Masuoka (2012) points out, the topic marker *toiuno-wa* cannot be used in event predication (or stage-level predication); it can only be used in property predication (or individual-level predication).<sup>1</sup> This is shown in the contrast between (1-a) and (1-b). (1-a), where *toiuno-wa* is used in event predication with the simple past tense, is unacceptable. (1-b), on the other hand, where *toiuno-wa* is used in property predication, is acceptable.


Masuoka (2012) concludes that *toiuno-wa* is used only for property predication.

Moreover, it is well known that the interpretations of *wa* and *ga* change depending on predicate type (Kuroda 1972; Kuno 1973b). In property predication, *wa* is the default marker, and *ga* tends to be interpreted as exhaustive listing. As exemplified in (2-a-b), both of which are copular sentences (i.e., property predication), the sentence with *wa* (2-a) is considered to have a common topic-comment structure, while the sentence with *ga* (2-b) is considered to focus only John. Specifically, (2-b) is interpreted as the answer to the question 'who is a student?' In Kuno's terminology, *ga* is interpreted as marking exhaustive listing.


<sup>1</sup>See §2.4.2.5 in Chapter 2 for the distinction between property vs. event predication.


In event predication, on the other hand, *ga* is the default marker and is interpreted as involving a neutral description, while *wa* tends to be interpreted as contrastive. In (3-a-b), which involve event predication, the NP followed by *wa* in (3-a) is interpreted to be contrastive, while the whole sentence including the NP with *ga* in (3-b) is interpreted to have a broad focus structure; in Kuno's terminology, *ga* here marks neutral description.

(3) b. ame-**ga** hut-te i-masu
rain-*ga* fall-and prog-plt
'It is raining.' (ibid.)

I am aware of only a few studies investigating the question of why the sentences of particular information structure types are associated with specific predication types.

### **8.3.2 Genres**

Genres are also an important factor influencing the phenomena investigated in this study. As pointed out in §2.4.2.7, for example, the choice between zero vs. overt particles is sensitive to styles (casual vs. formal). However, it is not clear why the formal style requires overt particles more often than the casual style.

Further, I have argued that post-predicate constructions are more frequent in conversations than in monologues. Although I offered a few possible explanations as to why this is the case (§5.3), there is still no clear answer. Since there is a corpus of conversations annotated in the same way as the corpus used in this study (Nakagawa & Den 2012), it could be useful to compare the two corpora.

It is likely that in monologues like the ones employed here, predicate-focus structures appear more frequently than in usual conversations; in narratives, the speaker usually talks about what s/he did or what happened to him/her, which fixes a topic (typically the speaker) – and fixing a topic elicits a predicate-focus structure. Moreover, because of the absence of hearers who ask *wh*-questions and who misunderstand what the speaker means, the speaker has to answer *wh*-questions or correct the hearer less frequently, which is what typically elicits an argument-focus structure. This is another reason why it is important to investigate other genres of spoken language.








Abbott, Barbara, 79 Aissen, Judith, 265 Allen, Barbara J., 210, 263, 264 Allen, James F., 57 Aoki, Reiko, 26, 133 Ariel, Mira, 168, 172 Arnold, Jennifer E., 49 Backhouse, Anthony E., 44 Baker, Mark C., 209, 210, 264 Bamgbose, Ayo, 208 Beckman, Mary E., 57–60 Bildhauer, Felix, 67 Birner, Betty J., 4 Bolden, G. B., 54 Bolinger, Dwight, 4 Bresnan, Joan, 196, 259 Bybee, Joan, 4, 259 Büring, Daniel, 4, 19 Calhoun, Sasha, 4, 5 Carlson, Gregory, 35 Chafe, Wallace L., 4, 9, 12, 16, 27, 62, 63, 77, 81, 189, 213, 248 Chang, F., 49 Chiarcos, Christian, 4, 5, 67 Chomsky, Noam, 3, 14, 15, 18, 49 Chujo, Kazumitsu, 48 Clancy, Patricia, 31, 172 Clark, Herbert H., 71, 81, 111 Comrie, Bernard, 4, 43, 46, 70, 78, 79, 81, 88, 126, 129, 133, 137, 189, 256, 257, 265

Contini-Morava, Ellen, 260 Cook, Philippa, 67 Cowan, Nelson, 224 Cowles, Wind H., 4, 67 Croft, William, 68, 69, 91, 113, 265 Cruttenden, Alan, 189 Dahl, Östen, 258 Daneš, František, 7, 49, 150, 187 de Swart, Peter, 258 Den, Yasuharu, 1, 59, 85, 131, 146, 178, 184, 189, 191, 192, 194, 214, 223, 241, 272 Dik, Simon C., 10 Dipper, Stefanie, 67 Dixon, Robert M. W., 88, 126 Downing, Laura J., 19, 267 Downing, Pamela, 31 Dowty, David, 81 Dryer, Matthew S.,19, 48, 77,196, 210, 211, 261 Du Bois, John W., 48, 57, 59, 62, 79, 81, 133, 137, 145, 189, 257 Duranti, A., 10 Elam, Gayle Ayers, 58 Endo, Yoshio, 4, 49, 56, 201, 259 Enomoto, Mika, 1, 59, 131, 223 Enç, Mürvet, 79, 80 Erguvanli, Eser Emine, 196 Erteschik-Shir, Nomi, 3, 4, 10, 11, 16, 71, 75

Fedden, Sebastian, 261 Ferreira, Victor S., 4, 49 Firbas, Jan, 4, 7, 49, 150, 187 Fraurud, Kari, 258 Fretheim, Thorstein, 16 Frischberg, N., 258 Fry, John, 38–41, 43, 44, 257 Fujii, Noriko, 39 Fujii, Yoko, 51, 180 Givón, Talmy, 4, 7, 13, 16, 64, 70, 78, 79, 81, 94, 137, 141, 156, 168, 172, 183, 189, 205, 224, 236, 259–261, 266 Gundel, Jeanette, 4, 7, 10, 12, 13, 15– 19, 71, 150, 168, 172 Guo, Jie, 184, 191, 192, 194 Gussenhoven, Carlos, 61 Götze, Michael, 4, 5 Hajičová, Eva, 4, 5, 14 Hale, Ken, 258 Halliday, M. A. K., 4, 7, 13–16 Hara, Yurie, 29, 30 Harrison, Sheldon, 209 Haspelmath, Martin, 68, 91, 265 Haviland, Susan E., 81 Hepburn, A., 54 Heritage, John, 53 Heycock, Caroline, 10, 23 Hinds, John, 50, 51, 180 Hopper, Paul J., 4, 199, 259 Hyman, Larry M., 267 Igarashi, Yosuke, 57–59, 229 Iida, Ryu, 85 Imamura, Satoshi, 49, 50, 168 Ishihara, Shinichiro, 4

Ito, Kiwako, 61 Iwasaki, Shoichi, 59, 62–64, 191, 213, 214 Jacennik, Barbara, 196 Jackendoff, Ray, 3, 19 Jaeger, T. Florian, 43, 44, 46, 257 Jefferson, Gail, 54 Jorden, Eleanor H., 38 Kageyama, Taro, 42, 55, 127, 210, 265, 271 Karttunen, Lauri, 79 Kawahara, Daisuke, 241 Keenan, Edward L., 70, 78, 79, 81, 137 Keenan, Elinor O., 10, 168 Kikuchi, Yasuto, 4 Kinsui, Satoshi, 4 Kluender, Robert, 48 Koide, Keiichi, 36, 113, 124 Koiso, Hanae, 57, 85, 236 Kondo, Tadahisa, 49, 164 Kondo, Yasuhiro, 23 Kori, Shiro, 61 Krifka, Manfred, 4 Kruijff-Korbayová, Ivana, 3, 7 Kubozono, Haruo, 59, 60 Kuno, Susumu, 4, 10, 12, 15, 21–23, 26–29, 49, 50, 56, 57, 71, 79, 80, 98, 109–111, 124, 150, 180, 200, 201, 255, 259, 271 Kuroda, Shige-Yuki, 22, 23, 28, 30, 109, 271 Kurohashi, Sadao, 241 Kurosaki, Satoko, 45 Kurumada, Chigusa, 43, 44, 46, 257 Lambrecht, Knud, 4, 8, 10–12, 15, 74– 77, 125, 139, 140, 173, 199, 250

Ishimoto, Yuichi, 57, 236

Lee, Duck-Young, 39 Levinson, Stephen C., 53 Levy, Roger, 44, 46 Li, Charles N., 13, 16, 81, 133, 258 Liberman, Mark, 189 Longacre, Robert E., 141, 236 Maekawa, Kikuo, 58, 59, 68, 85 Makino, Seiichi, 44 Marshall, C. R., 71 Martin, Dorothy J., 141, 236 Maruyama, Takehiko, 147 Massam, Diane, 262, 263 Masuoka, Takashi, 4, 19, 23, 33, 35, 36, 106, 123, 271 Mathesius, Vilém, 4, 49, 150, 187 Matisoff, James, 266 Matsuda, Kenjiro, 39, 41, 257 Matsumoto, Kazuko, 48, 63, 64, 213, 214, 248 Matsushita, Daizaburo, 4, 8, 9, 14, 15, 27, 271 Maynard, Senko K., 31 Mazuka, Reiko, 48 McGregor, R. S., 129 Mikami, Akira, 4, 89, 139, 142 Miller, George A., 224 Minashima, Hiroshi, 43, 257 Minkoff, Seth, 258 Mio, Isago, 271 Mithun, Marianne, 16, 189, 209, 262, 264, 265 Miyamoto, Edison T., 48 Miyata, Koichi, 24 Morimoto, Yukiko, 196 Morita, Emi, 191 Murai, Michiyo, 57 Musan, Renate, 4

Nakagawa, Natsuko, 38, 46, 50, 52, 64, 85, 118, 124, 126, 132, 145, 178, 180–184, 257, 266, 272 Nakajima, Shin-Ya, 57 Nakamoto, Keiko, 47 Nihongo Kijutsu Bumpô Kenkyû Kai, 34 Nishida, Naotoshi, 23 Niwa, Tetsuya, 36, 40–45, 118, 119, 124, 165 Noda, Hisashi, 4, 24, 27, 47 Numata, Yoshiko, 24 Oberauer, Klaus, 224 Ochs, E., 10 Ohki, Mitsuru, 267, 268 Okubo, Takashi, 57 Okutsu, Keiichiro, 24 Ono, Susumu, 23 Ono, Tsuyoshi, 24, 39, 50–52, 180, 184, 190 Onoe, Keisuke, 4, 27, 31, 45 Perlmutter, David M., 55 Pierrehumbert, Janet B., 57–60, 189 Pitrelli, John F., 58 Pomerants, Anita, 53 Poppe, Nicholas, 256 Portner, Paul, 12 Poser, William J., 58 Prieto, Pilar, 189 Prince, Ellen, 4, 8, 9, 12, 16, 17, 78, 81, 82, 99 Reinhart, Tanya, 10–13, 17 Ritz, Julia, 5, 67 Rizzi, Luigi, 3, 49, 259 Rooth, Mats, 3, 17, 18 Russell, Bertrand, 3

Saeki, Tetsuo, 47 Sakuma, Kanae, 35 Sasaki, Kan, 38 Sato, Yo, 118, 124 Schegloff, Emanuel A., 53 Schieffelin, Bambi, 10 Selkirk, Elisabeth, 3, 61, 208 Sgall, Petr, 4 Shibatani, Masayoshi, 19, 20, 22, 23, 25, 37, 46–48, 50, 180 Shimojo, Mitsuaki, 38, 39 Silverman, Kim, 58 Silverstein, Michael, 43 Skopeteas, Stavros, 4, 67 Stalnaker, Robert C., 11, 15 Steedman, Mark, 3, 4, 7 Strawson, Peter F., 3, 10, 72 Sugito, Miyoko, 57, 65 Suzuki, Ryoko, 50–52, 180, 190 Suzuki, Satoko, 45 Suzuki, Shigeyuki, 24 Swerts, Marc, 57 Takahashi, Minako, 36, 113, 124 Takahashi, Shoichi, 48 Takami, Ken-Ichi, 50, 51, 56, 180, 186 Takubo, Yukinori, 19, 33, 34, 123 Tanaka, Akio, 22 Tanaka, Hiroko, 53, 54, 184, 194 Tateishi, Koichi, 42 Teramura, Hideo, 24, 27, 29, 30 Thompson, Sandra A., 199, 258 Togo, Yuji, 267, 268 Tokieda, Motoki, 4, 24 Tomasello, Michael, 77 Tomlin, Russell S., 4, 207, 208 Truckenbrodt, Hubert, 189 Tsutsui, Michio, 38–41, 44, 45, 93, 257

Ueno, Mieko, 48 Vallduví, Enric, 4, 7, 11, 18 Venditti, Jennifer J., 57–59 Vilkuna, Maria, 4, 18 von der Gabelentz, Georg, 7 von Stechow, Arnim, 17 Ward, Gregory, 4 Watanabe, Michiko, 178 Watanabe, Minoru, 191 Watanabe, Yasuko, 31 Woodbury, Hanni J., 262, 264 Yamada, Yoshio, 4, 22, 25–27, 271 Yamashita, Hiroko, 49, 164 Yamashita, Yoichi, 57 Yasuda, Akira, 23 Yatabe, Shuichi, 42 Yoshita, Hiromi, 49 Zerbian, Sabine, 267

# **Language index**

Chinese, 33 Czech, 4 Dutch, 61 English, 17, 19, 22<sup>5</sup>, 27<sup>12</sup>, 33, 36, 41, 49, 58, 61, 71, 74–76, 139, 199, 213 French, 17, 208, 209, 267, 268 Georgian, 144, 145 German, 32, 61, 140, 199 Hindi, 129, 132, 256 Hixkaryana, 19 Iranian, 168 Iroquoian, 189, 262, 264 Kansai Japanese, 145, 257, 266, 266<sup>3</sup> Lahu, 265, 266 Mam-Maya, 258 Mian, 261 Mokilese, 209 Mongolian, 256, 257 Navajo, 258 Niuean, 262–264 Onondaga, 262, 264 Polish, 196 Russian, 19, 45, 265 Ryukyuan, 175<sup>6</sup> Siouan, 189 Spanish, 267 Swahili, 260, 261 Turkish, 19, 79, 196 Yoruba, 208, 209

# **Subject index**

aboutness, 5, 11, 12 accentual phrase, 58, 222 accentual-phrase boundary, 58, 59, 118 accusative, 20, 22, 79, 140, 141, 145, 256, 257, 265 accusative marker, 20, 25, 200<sup>13</sup>, 257, 266 activation cost, 52, 54, 77, 143, 157, 168, 184, 188, 189, 222, 247, 256 activation status, 50, 78, 84, 88, 91, 258 adjective, 60, 68, 69 adverbial, 19, 22, 22<sup>5</sup>, 27<sup>12</sup>, 53, 142, 165 adverbial particle, 22, 26, 27 affix, 47, 258, 260, 261 agent-like argument, 19, 88, 120, 126 agentivity, 81, 126, 129, 131, 142, 144 anaphoric, 12, 27–29, 84–88, 92, 93, 96, 98, 100, 111, 115, 137, 147, 148, 150, 153, 154, 161, 183<sup>8</sup>, 195, 196, 198, 202, 214, 216, 218, 236, 237 anaphoric distance, 143 anaphoric element, 156, 224, 236 animacy, 44, 81, 129, 131, 142, 144, 254, 257, 259, 264 animate, 5, 43, 43<sup>21</sup>, 70, 81, 128, 129, 131, 207, 254, 256–259, 263, 265

antecedent, 31, 52, 84, 86, 87, 92, 104, 143, 148, 169, 171, 173, 173<sup>5</sup>, 174, 177, 183, 222 argument focus, 23, 39, 126, 127, 254 argument-focus structure, 199, 272 broad focus, 23, 61, 62, 64, 76, 132, 255, 272 case marker, 22, 24, 36, 79 case particle, 4, 126, 136 clausal, 62, 64, 93, 94, 148, 213, 216–218, 220, 222, 224, 227, 229–232, 234, 238, 240, 247–249 clitic, 21<sup>4</sup>, 189, 224 conceptual space, 67–70, 92, 103, 113, 126, 142, 143 continuity principle, 160, 161, 199, 200, 207, 208, 210, 211 contrastive focus, viii, 126–128, 132, 269 contrastive topic, 18, 145 contrastiveness, 15, 17–19, 24<sup>7</sup>, 32<sup>15</sup>, 126, 142, 144 copula, vii, 36, 91–93, 98, 104, 113, 116, 117, 142, 164, 164<sup>2</sup>, 221, 254, 269 definiteness, 3, 5, 89, 99, 129, 135, 256, 257, 264, 270 demonstrative, 184, 194, 219, 220 detached topic, 139–141

direct object, 129, 256, 265 discourse, 3, 8, 9, 11, 12, 14, 19, 27, 28, 36, 45, 46, 50, 57, 64, 70, 73, 76, 78, 81, 83, 84, 86, 87, 94, 99, 105, 110, 111, 113, 120, 124, 135, 145, 148, 154, 158, 166, 168, 171, 176, 188, 214, 216, 238, 250, 255, 257 downstepping, 58, 60, 60<sup>30</sup> emotive, 190, 192 falling intonation, 192, 220 first mora, 143, 169, 219, 220, 222, 232, 243, 245 focus, 5, 13–15, 91, 258 focus marker, 23, 24, 24<sup>7</sup> grammatical relation, 189, 243, 247 hearer, 3, 8, 9, 11, 12, 14–17, 34, 45, 70–74, 77, 77<sup>1</sup>, 78–84, 86, 92, 98, 105, 106, 108, 109, 116–118, 120, 123–125, 141, 146, 156, 166, 169, 172, 186, 192, 194, 223, 224, 241, 242, 247, 254, 255, 269 implicature, 30, 190, 192 inanimate, 5, 43, 43<sup>21</sup>, 70, 81, 129, 131, 207, 254, 263, 265 indefinite, 3, 5, 17, 28, 41, 43, 70, 73, 78, 78<sup>3</sup>, 79, 83, 129, 135, 136, 166, 196, 198, 205, 207, 209, 254, 256, 266, 270 inferable, 8, 9, 70, 71, 80–82, 88, 92, 93, 98, 102, 103, 106, 108–111, 114, 116, 120, 121, 124, 137, 142, 143, 149, 156, 157, 161, 164, 168, 169, 207, 217, 218, 220, 238, 247, 250, 254, 255

information status, 21, 84, 87, 92, 93, 147, 148, 150, 195, 199, 202, 216 information structure, vii, viii, 1, 3–5, 7, 8, 13–15, 18, 19, 23, 26, 46, 47, 54, 57, 64, 65, 67, 69, 70, 74, 76, 133, 147, 160, 180–182, 199, 200<sup>13</sup>, 202, 206, 210, 211, 213, 214, 240, 242, 243, 247, 248, 255, 267, 269, 271, 272 interactional, 47, 53, 74, 184, 190, 191, 191<sup>11</sup>, 192, 194, 206, 255 intermediate phrase, 60 intonation contour, 50, 52, 62, 99, 118, 255, 260, 262 intonation unit, 58, 59, 62–64, 189, 191, 191<sup>11</sup>, 200, 201, 206, 209, 213, 230, 236, 247, 248, 255 intonational phrase, 58–60, 268 intonational-phrase boundary, 58, 59 intransitive, 19, 55, 88, 126, 131, 144 low activation cost, 188, 189, 247 low pitch, 189, 222, 237, 238, 241 main clause, 118, 142, 181, 185–187, 266 monologue, 199, 248, 250 narrow focus, 23, 40, 61, 64, 76, 91, 132, 254 newness, 16, 57 nominative, 20, 25, 26, 36, 130, 131, 133, 141, 145, 257 nominative case, 22, 25–27, 133 non-anaphoric element, 101, 115, 237

non-contrastive focus, 127, 128, 132, 144 non-nominative focus, 130, 131, 133, 145 noun incorporation, 209, 210, 262, 263, 265 piano, 79, 80, 101, 104, 111, 114, 220, 230, 235 pitch, 21, 29, 58, 59, 60<sup>30</sup>, 62, 63, 65, 181, 189, 200, 222–224, 227, 235, 236, 241, 243, 245, 245<sup>7</sup> pitch accent, 21, 181<sup>7</sup>, 201<sup>14</sup>, 224, 224<sup>2</sup>, 266<sup>3</sup> pitch contour, 58, 192, 219, 220, 222, 229, 230, 232, 233, 235, 236, 238 pitch peak, 17–19, 58, 60, 61, 224, 238 pitch range, 58, 59, 61, 201, 222, 224, 227, 229<sup>3</sup>, 232, 233 pitch reset, 58, 60<sup>30</sup>, 65, 219, 220, 223, 232, 236, 237, 243, 245 post-predicate construction, 181, 190, 192 post-predicate element, 182, 190, 192 post-predicate position, 189, 191 postposed construction, 52, 194 postposed element, 51, 52, 180, 181, 184–187, 206 postposed part, 50, 185 predicate focus, 23, 77, 79, 255, 262 predicate-focus context, 243, 245 predicate-focus structure, 31, 75–77, 79, 84, 210, 245, 272 predication, 15, 35, 35<sup>16</sup>, 36, 68, 69, 106, 270, 271, 271<sup>1</sup>, 272

presupposition, 3, 14, 15, 73, 74, 79, 82, 83, 258 production experiment, 49, 64, 213, 240, 247 pronominal, 189, 210, 261, 262 pronoun, 76, 86, 125, 140, 140<sup>14</sup>, 141, 144, 169, 172, 173<sup>5</sup>, 174, 176, 184, 210, 224, 253, 260 rising intonation, 192 second mora, 241, 243 semantic map, 68, 69, 91, 103, 126, 142, 269 semi-active state, 9, 78 spontaneous speech, 59, 64, 84, 174 thetic, 30–32, 271 topic, v, vii, viii, 1, 4, 5, 7–10, 10<sup>2</sup>, 11–20, 22, 23, 24<sup>7</sup>, 27, 32, 34–37, 41, 41<sup>20</sup>, 43, 49, 52, 64, 67, 70–72, 74–82, 82<sup>5</sup>, 83–85, 89, 91, 92, 94, 98, 99, 99<sup>6</sup>, 101, 102, 104, 111, 113–118, 120–125, 133, 136, 137, 139, 141–143, 145, 152, 156, 157, 160, 161, 166–169, 171, 171<sup>4</sup>, 173–179, 199, 200, 207, 209, 218–224, 227, 229, 230, 234, 241, 245, 247, 249, 250, 254–258, 260, 261, 264–266, 269–272 topicality, 202, 205, 206 transitive, 19, 55, 56, 120, 210 transitive clause, 25, 88, 126, 144, 207, 209, 210 transitive verb, 55, 208 unaccusative, 42, 55–57

unergative, 42, 55 utterance, 3, 12–18, 51, 54<sup>27</sup>, 58, 71–74, 78, 82, 82<sup>5</sup>, 84, 86, 101, 105<sup>7</sup>, 106, 122, 124–126, 150, 153, 178, 181, 184, 185, 188, 191, 192, 223, 240, 243, 249 verb, 20, 44, 55, 56, 137, 140, 180, 194, 208–210, 241–243, 245, 247, 248, 256, 258–262, 264, 265 verb-final language, 56, 180, 255 vowel, 21, 63, 208, 223, 236, 238, 242, 247 word order, vii, 2, 3, 6, 7, 14, 20, 22, 46–50, 53–55, 57, 65, 69, 89, 93, 93<sup>2</sup>, 94, 141, 147, 148, 150, 156, 160, 161, 168, 174, 176, 183, 189, 190, 192, 200<sup>13</sup>, 202, 205, 207, 210, 211, 216, 240, 253–259, 269 zero particle, 20, 38, 39, 43, 45, 72, 92, 99, 99<sup>6</sup>, 118

# Information structure in spoken Japanese

This study explores information structure (IS) within the framework of corpus linguistics and functional linguistics. As a case study, it investigates IS phenomena in spoken Japanese: particles (including so-called topic particles, case particles, and zero particles), word order, and intonation. The study discusses how these phenomena relate to human cognitive and communicative mechanisms.